SOMEF: a metadata extraction framework from software documentation
Presentation given at the council of software registries in March 2021. SOMEF is a Python package for automatically extracting over 25 metadata categories from a README file. The output is then exported in JSON or in JSON-LD using the CodeMeta representation.
1.
SOMEF:
A Metadata Extraction Framework
from Software Documentation
Daniel Garijo
@dgarijov
Ontology Engineering Group,
Universidad Politécnica de Madrid
2.
The Importance of Software Metadata
- An increasing amount of software is available online
- Metadata is critical for:
- Findability
- Reusability
- Understanding
> 100 M repositories
> 28 M repositories
3.
Current Pipeline for Software Metadata Curation
Icons from https://icons8.com/
Develop software & documentation
Create entry in registry
Curation and validation
7.
Beginning to capture software in context
SOMEF helps automatically extract software metadata. Benefits:
● Let authors know what metadata is missing
● Help make software findable
● Capture context of software (setup, configuration, examples)
But there is much more to explore:
● Better comparison
● Input/output preparation
● Using different software together
Problems? Suggestions? New features?
https://github.com/KnowledgeCaptureAndDiscovery/somef/issues
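As a rough illustration of the first benefit listed above (letting authors know what metadata is missing), here is a minimal Python sketch. It does not use SOMEF and is not how SOMEF works internally: the category names, the keyword table, and the helper functions `extract_sections` and `missing_categories` are all hypothetical, and the sketch only matches Markdown header text against keywords.

```python
import re

# Hypothetical mapping from metadata category to header keywords.
# This table is an assumption for illustration, not SOMEF's actual rules.
CATEGORY_KEYWORDS = {
    "description": ("description", "overview", "about"),
    "installation": ("install", "setup"),
    "usage": ("usage", "example", "getting started"),
    "citation": ("citation", "cite"),
    "license": ("license",),
}

# Matches Markdown ATX headers such as "## Installation".
HEADER_RE = re.compile(r"^#{1,6}[ \t]+(.*\S)[ \t]*$", re.MULTILINE)


def extract_sections(readme_text):
    """Map each detected category to the README headers that matched it."""
    found = {}
    for match in HEADER_RE.finditer(readme_text):
        header = match.group(1)
        lowered = header.lower()
        for category, keywords in CATEGORY_KEYWORDS.items():
            if any(keyword in lowered for keyword in keywords):
                found.setdefault(category, []).append(header)
    return found


def missing_categories(readme_text):
    """Return the categories for which no header was detected, sorted by name."""
    found = extract_sections(readme_text)
    return sorted(c for c in CATEGORY_KEYWORDS if c not in found)


# Made-up README used only to exercise the sketch.
SAMPLE_README = """# MyTool

## Installation
pip install mytool

## Usage examples
Run mytool --help.

## License
MIT
"""

if __name__ == "__main__":
    print("Found:", extract_sections(SAMPLE_README))
    print("Missing:", missing_categories(SAMPLE_README))
```

For the made-up README above, the sketch reports that the description and citation categories are missing, which is the kind of feedback an author could act on before registering the software.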