The presentation gives an overview of what metadata is and why it is important. It also addresses the benefits that metadata can bring, offers advice on how to produce good-quality metadata, and closes with how EUDAT uses metadata in the B2FIND service.
November 2016
Introduction to Metadata
1. EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065 www.eudat.eu
Introduction to metadata
Version 2
August 2016
This work is licensed under the Creative Commons CC-BY 4.0 licence
2. What is metadata and why do we need it?
How to produce good quality metadata?
EUDAT and metadata
Overview
3. WHAT IS METADATA?
Image CC-BY ‘Metadata is a love note to the future’ by Cea+ www.flickr.com/photos/centralasian/8071729256
4. Commonly defined as ‘data about data’, metadata helps to make data findable and understandable.
Metadata can be:
Descriptive: information about the content and context of the data
Structural: information about the structure of the data
Administrative: information about the file type, rights management and preservation processes
What is metadata?
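The three flavours usually live together in one record. As a sketch in Python (field names follow Dublin Core conventions; all values are hypothetical examples), a minimal record mixing descriptive, structural and administrative information might look like:

```python
import json

# Minimal metadata record sketch. Field names follow Dublin Core
# conventions; all values are hypothetical examples.
record = {
    "title": "Greater Yellowstone Rivers from 1:126,700 U.S. Forest Service Visitor Maps (1961-1983)",  # descriptive
    "creator": "U.S. Forest Service",  # descriptive: context
    "format": "application/pdf",       # administrative: file type
    "rights": "CC-BY 4.0",             # administrative: rights management
    "relation": "sheet 1 of 3",        # structural: how parts fit together
}

print(json.dumps(record, indent=2))
```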
5. Comprehensive metadata will:
Facilitate data discovery
Help users determine the applicability of the data
Enable interpretation and reuse
Allow any limitations to be understood
Clarify ownership and restrictions on reuse
Offer permanence as it transcends people and time
Provide interoperability
Why use metadata?
6. Metadata and documentation
Think about what will be needed in order to find, evaluate, understand, and reuse the data.
Have you documented what you did and how?
Did you develop code to run analyses? If so, this should be kept and shared too.
Is it clear what each bit of your dataset means? Make sure the units are labelled and abbreviations explained.
Record all the information needed for you and others to understand the data in the future.
8. Create metadata at the time of data creation
Information will be forgotten and there won’t be time or effort left to capture it later.
Metadata benefits from quality control at an early stage too.
Time matters!
Image CC-BY-SA ‘egg timer – hour glass running out’ by OpenDemocracy www.flickr.com/photos/opendemocracy/523438942
9. GOOD QUALITY METADATA
Image CC-BY ‘Quality’ by Elizabeth Hahn www.flickr.com/photos/128185330@N03/17517769750
10. Use of standards
Controlled vocabularies for unambiguous keywords
Simple, complete and consistent information
Appropriate description
Explanation of limitations to support reuse
Avoid special characters e.g. !@<~
Provide persistent identifiers such as DOIs
What makes metadata good?
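Persistent identifiers such as DOIs stay resolvable even when landing pages move, because they are dereferenced through the doi.org proxy. A minimal sketch (the DOI below is a hypothetical placeholder, not a registered identifier):

```python
# Build the resolvable URL for a DOI via the doi.org proxy.
# The DOI itself is a hypothetical placeholder.
doi = "10.1234/example-dataset.v1"
landing_url = f"https://doi.org/{doi}"
print(landing_url)
```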
11. The good and the bad
More precise and standardised | Ambiguous
Metres / seconds | Furlongs and fortnight
2015-09-10T15:00:01+01:00 | 10th Sept. 2015 15:00:01
Longitudinal wind speed | U
PDF 1.7 | PDF
2008 US Population statistics | Population statistics
Barcelona, Venezuela | Barcelona
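The timestamp in the precise column is ISO 8601, which is exactly what makes it machine-readable: standard tooling can parse it without guessing, while the free-text form cannot be parsed reliably. A quick check in Python:

```python
from datetime import datetime, timedelta

# The unambiguous timestamp from the slide, in ISO 8601 form.
iso_value = "2015-09-10T15:00:01+01:00"

# Standard-library parsing works directly on ISO 8601 strings;
# the free-text form "10th Sept. 2015 15:00:01" would not parse.
dt = datetime.fromisoformat(iso_value)
print(dt.year, dt.utcoffset())
```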
12. Metadata standards
Metadata standards provide a structured way to describe the data.
Information is presented in a reliable and predictable format which allows for computer interpretation.
Use of standards enables data interoperability.
13. Metadata Standards Directory
Catalogue initiated by the Digital Curation Centre (DCC), now maintained as a community initiative via the Research Data Alliance.
www.dcc.ac.uk/resources/metadata-standards
14. There are a number of factors to consider:
Data type – look for standards to suit your data
Community norms – what is accepted and common practice in your field?
Organisational policies – is one recommended?
Instruments being used – any automated metadata?
What resources are available? – there are tools to create metadata in certain standards, plus instructional materials and support
How to choose a metadata standard?
15. How to write quality metadata
Organise your information and reuse where possible e.g. project abstracts, lab notebooks, citations
Write your metadata using a metadata tool
Review for accuracy and completeness
Have someone else read your record
Revise based on comments from your reviewer
Review once more before you publish
(Cycle: Draft → Review → Revise → Review)
16. Tips to follow when creating metadata
Do not use jargon
Define technical terms and acronyms:
– CA, LA, GPS, GIS: what do these mean?
Clearly state data limitations
– E.g. data set omissions, completeness of data
– Express considerations for appropriate re-use
Use “none” or “unknown” meaningfully
– None usually means that you knew about the data and nothing existed (e.g., a “0” cubic feet per second discharge value)
– Unknown means that you don’t know whether that data existed or not (e.g., a null value)
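In a data file the same distinction can be kept by storing a real zero for “none” and a null for “unknown”, rather than collapsing both to one value. A small sketch (site names and values are hypothetical):

```python
# "none": we measured and nothing was there -> store a real 0.
# "unknown": we did not measure -> store a null (None in Python).
# Site names and values are hypothetical.
discharge_cfs = {
    "site_upstream": 0.0,     # zero flow, actually observed
    "site_downstream": None,  # no observation made
}

# Re-users can then separate observed values from gaps.
observed = {k: v for k, v in discharge_cfs.items() if v is not None}
print(observed)
```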
17. Dataset titles
Titles are critical in helping readers find your data
– While individuals are searching for the most appropriate data sets, they are most likely to use the title as the first criterion to determine whether a dataset meets their needs.
– Treat the title as an opportunity to sell your dataset.
A complete title includes: What, Where, When, Who, and Scale
An informative title includes: topic, timeliness of the data, specific information about place and geography
18. Which is the better title?
Rivers
OR
Greater Yellowstone Rivers from 1:126,700 U.S. Forest Service Visitor Maps (1961-1983)
Greater Yellowstone (where) Rivers (what) from 1:126,700 (scale) U.S. Forest Service (who) Visitor Maps (1961-1983) (when)
19. Write for machines, not just humans
Remember: a computer will read your metadata
Do not use symbols that could be misinterpreted:
Examples: ! @ # % { } | / < > ~
Don’t use tabs, indents, or line feeds/carriage returns
When copying and pasting from other sources, use a text editor (e.g., Notepad) to eliminate hidden characters
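A simple sanity check can catch these characters before a record is published. A minimal sketch in Python; the blocklist is illustrative (it follows the symbols listed on this slide, minus “/”, which legitimately appears in values such as “Metres / seconds”):

```python
import re

# Characters the slide warns about, plus tabs and line breaks.
# The exact set is a policy choice; this list is illustrative.
BAD_CHARS = re.compile(r'[!@#%{}|<>~\t\r\n]')

def is_machine_safe(text: str) -> bool:
    """Return True if the metadata value avoids risky characters."""
    return BAD_CHARS.search(text) is None

print(is_machine_safe("Longitudinal wind speed"))
print(is_machine_safe("wind speed @ 10m"))
```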
20. Could someone use an automatic search to locate the data?
Can others assess the usefulness of the data?
Could a novice understand it?
Is the metadata specific enough?
Is there enough information to re-use the data?
Is the information unambiguous – are all codes, abbreviations and variables explained?
Remember to review your metadata!
21. EUDAT AND METADATA
Image CC-BY ‘University of Michigan Library Card Catalog’ by David Fulmer
www.flickr.com/photos/annarbor/4350629792
22. B2FIND is based on a comprehensive joint metadata
catalogue of research data collections stored in EUDAT
data centres and other repositories
It allows researchers or data users to find relevant data,
and supports communities and data providers to increase
visibility of their data
B2FIND provides a simple and user-friendly discovery
service on metadata steadily harvested from a wide
range of research communities
The B2FIND service
b2find.eudat.eu
23. The same term can be used by different disciplines
Species for chemists and zoologists
Andromeda for astronomers and historians
Some domain knowledge is therefore necessary
The EUDAT B2FIND service needs to suit a wide range of
different communities
The interdisciplinary problem
24. Metadata is harvested from different communities,
usually using the OAI-PMH protocol
The metadata (in a wide variety of standards) are
processed to map and transform them to the B2FIND
schema
How the B2FIND service works
INPUT
Metadata in community
standards e.g. DDI,
Dublin Core, CMDI, ISO
19115
OUTPUT
Homogenised metadata
in the B2FIND schema
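The harvesting step described above can be sketched with a standard OAI-PMH request. A minimal illustration (the endpoint URL is hypothetical, and a real harvester would also page through resumptionTokens):

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE = "https://repository.example.org/oai"  # hypothetical endpoint
request_url = BASE + "?" + urlencode(
    {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
)

# A hand-written fragment of the kind of oai_dc response a repository returns.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords><record><metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Greater Yellowstone Rivers</dc:title>
    </oai_dc:dc>
  </metadata></record></ListRecords>
</OAI-PMH>"""

# Extract Dublin Core titles from the harvested XML.
DC = "{http://purl.org/dc/elements/1.1/}"
titles = [el.text for el in ET.fromstring(SAMPLE).iter(DC + "title")]
```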
25. Metadata records in B2FIND
http://b2find.eudat.eu/dataset/3a063891-6952-5bcf-a5ed-46f8a681c1c9
26. For more info: https://eudat.eu/services/b2find
User documentation: https://www.eudat.eu/services/userdoc/b2find-integration
b2find.eudat.eu
27. www.eudat.eu
Authors Contributors
This work is licensed under the Creative Commons CC-BY 4.0 licence
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures.
Contract No. 654065
Sarah Jones, Digital Curation Centre
Shaun de Witt, STFC
Sara Garavelli, Trust-IT
Thank you
Content has also been repurposed from the DataONE Educational
modules, ‘Metadata’ and ‘How to Write Good Quality Metadata’ Retrieved
from https://www.dataone.org/education-modules
Editor's Notes
This presentation will give an introduction to the concept of metadata, why it is important and how to address this in research projects.
There are three main topics that we will discuss:
What is metadata and why is it important. Here we will think about the benefits that metadata can bring to you and others.
Secondly, we will think about how to produce good quality metadata and offer some advice and tips
To close, we’ll explain how EUDAT uses metadata in the B2FIND service
So let’s begin by thinking about what metadata is.
The quote in the image here that ‘metadata is a love note to the future’ really gets at the meaning.
Metadata is critical to ensure data can be found, understood and reused. If you don’t create metadata, it’s unlikely you will still understand your data in a few years time. The act of creating metadata opens up possibilities for future use.
Metadata is commonly defined as ‘data about data’. By creating metadata, you will ensure that others can find and understand your data.
There are different types of metadata:
Descriptive metadata includes things like the title, author, date, location, coverage and subjects. It’s all the basic information people would need to find the data and understand the main content and context.
Structural metadata explains how data interrelate. For example, if a book has been digitised, you want to know which set of images forms each chapter.
Administrative metadata may include information added by others e.g. preservation metadata added by a repository to note what processes have been performed on the data
There are lots of reasons to create metadata. It:
Facilitates data discovery so others can find your data and your research gains more recognition and impact
Good metadata helps potential users determine whether the data meet their needs, and enables them to interpret and reuse the data
Metadata should outline any limitations and clarify data ownership and restrictions on reuse. This ensures others use the data appropriately
Without metadata, your data will become meaningless over time as others can’t understand and reuse them. Providing associated metadata will give your data permanence and ensure they live on.
By creating high quality metadata and using standards, you can also make your data interoperable
When you create metadata, it is useful to think broadly. You may come across the concept of ‘documentation’ to explain all the details you need to capture and share too.
You should aim to provide all the information a third-party would need to understand and reuse your data. This may include a description of what you did, your workflows, any code created and data dictionaries or clarification of all terms and abbreviations.
This diagram from Bill Michener from DataOne in the United States shows how much information is lost over time.
Metadata is a way to formalise this knowledge so your data retain meaning.
Time really matters when creating metadata – you should create metadata at the time of data creation as information is forgotten quickly. This also gives you an opportunity to do quality control early on.
We have explained why metadata is so important. Let’s now think about how to create good quality metadata.
There are lots of things you can do to improve the quality of your metadata:
Primary among these is to use standards. There are lots out there so look for something relevant to your data type and discipline.
Metadata standards don’t always prescribe how the information should be completed. For this you want to use controlled vocabularies or thesauri for keywords, and recognised ISO standards for common elements like languages or dates
Be consistent in the information provided and ensure the description is appropriate – enough information to avoid being ambiguous but also simple and concise
Any limitations with the dataset should be explained to ensure others reuse it appropriately and don’t make false assumptions
You should avoid using special characters, particularly in file names or column headers in spreadsheets as some software may interpret these symbols as an operator
Also provide persistent identifiers so others can reliably link to and locate your data. This helps with citation and tracking impact too.
Let’s look at some good and bad examples. You can see what we’re looking for in terms of metadata is more precise and standardised entries rather than information that could be ambiguous.
Metres and seconds are universally accepted units of measurement as opposed to furlongs or fortnights
For clarity, provide dates and times in the ISO standard, specifying the timezone
In the third example we can see a properly described variable as opposed to an abbreviation which others may not understand
When stating file formats, it is always useful to specify the version too
The final two examples show the need to be specific so others can understand the coverage properly
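The date guidance above can be sketched with the ISO 8601 format. A minimal example (the timestamp is invented):

```python
# Format a timestamp as ISO 8601 with an explicit timezone offset,
# as recommended for unambiguous metadata.
from datetime import datetime, timezone, timedelta

cet = timezone(timedelta(hours=1))  # Central European Time (UTC+1)
stamp = datetime(2016, 11, 15, 9, 30, tzinfo=cet).isoformat()
# "2016-11-15T09:30:00+01:00" is unambiguous; "15/11/16 9.30" is not.
```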
It is highly recommended to use metadata standards. It enables interoperability and ensures the information is presented in a predictable way to allow it to be processed by computers.
There are lots of standards that can be used and you can search for them by discipline.
The DCC started a catalogue of disciplinary metadata standards which is now being taken forward as an international initiative via an RDA working group
When choosing a metadata standard, you should consider:
Your data type
What is accepted practice for your field
Whether your organisation or the tools and instruments being used suggest using one format over another (for example if one is recommended or if some metadata is created automatically in a given standard)
Also think about what resources you have available. Some standards have associated tools and more comprehensive instructional materials, so they may be preferred.
When writing metadata, reuse information where possible rather than starting from scratch. For example, you may be able to use a project abstract written for your proposal, information from lab notebooks or citations for data you are reusing.
Where possible, write your metadata using a tool to make the processes easier and more consistent.
Think about metadata creation as an iterative process. It’s best to ask somebody else to read your record to make sure it makes sense to others and then revise and review it again before publishing.
Some general tips to follow include:
Avoid using jargon
Define any abbreviations, acronyms or technical terms
Clearly state limitations and express considerations for appropriate reuse
Use terms like ‘none’ or ‘unknown’ properly
Be comprehensive when writing your dataset title as this is how others will determine whether to look into your data further.
A complete title should explain what the data relates to, a location, time period, subject and scale.
This example illustrates the importance of descriptive titles in metadata records.
The second title gives enough detail for a reader to discern whether they might like more information about your data.
When you are writing your metadata, remember that it will be read by machines as well as people.
You should avoid using symbols that could be misinterpreted, and tabs/indents/breaks that may be stripped out. Using a text editor for copying data will ensure hidden characters and formatting are removed.
The final point to reiterate is the need to review your metadata.
It’s always useful to get a second opinion to make sure others can understand it and feel it’s clear and specific enough.
To close we want to explain how EUDAT is approaching metadata and how services like B2FIND can help you
B2FIND provides a simple and user-friendly service that allows users to discover a wide range of metadata from a variety of research communities. It is based upon a comprehensive metadata catalogue of data collections stored in the EUDAT data centres and harvested from other data repositories.
The B2FIND service helps researchers to find relevant data to reuse, and helps data providers to increase the visibility of their data.
Since EUDAT is a pan-European infrastructure supporting a wide range of disciplines, we have to think about how terms are used differently by different communities.
Chemical species, for example, are atoms, molecules, ions and so on, whereas for zoologists, ‘species’ denotes distinct kinds of animals
The B2FIND service works by harvesting metadata from different communities. This is done on a regular and incremental basis, usually using the OAI-PMH protocol.
The metadata is provided in a range of community standards. It is then processed to transform it to the generic B2FIND schema to allow cross-search.
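The mapping step described above can be sketched as a simple field translation. An illustrative example only (the field names and function are invented; the real B2FIND schema and mapping rules are more involved):

```python
# Map a Dublin Core-style record into a flat, homogenised discovery schema.
def to_discovery_record(dc):
    mapping = {
        "title": "Title",
        "creator": "Creator",
        "subject": "Discipline",
        "language": "Language",
    }
    # Fields the source record lacks are marked "unknown" rather than dropped.
    return {out: dc.get(src, "unknown") for src, out in mapping.items()}

record = to_discovery_record(
    {"title": "Greater Yellowstone Rivers", "creator": "U.S. Forest Service"}
)
```

Homogenising every community standard into one schema is what makes cross-disciplinary search possible.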
This is what a record looks like in the B2FIND catalogue. There’s a basic description, a number of keyword tags and some additional information to note the source, creator, language etc
To find out more about B2FIND or use the service, please follow the links provided.