This document discusses generating linked data descriptions of Debian packages in the Debian Package Tracking System (PTS). It introduces the speaker and their background working on software forges and interoperability. The rest of the document outlines issues with current software project descriptions, describes how the PTS now generates RDF metadata for Debian packages, shows an example of Apache2 packaging in RDF format, discusses the ADMS.SW ontology used to model the data, and explores potential uses of the linked data including tracing packages across the FLOSS ecosystem and matching packages to their upstream projects.
Generating Linked Data descriptions of Debian packages in the Debian PTS
Outline: Introduction / The Debian PTS / Meta-data about source packages / Applications
Olivier Berger <mailto:obergix@debian.org>,
Debian + Télécom SudParis
Sunday 25/11/2012
Debian mini-debconf - Paris
Quick Introduction
Short bio
Olivier BERGER
<mailto:olivier.berger@telecom-sudparis.eu>
<mailto:obergix@debian.org>
Research Engineer at TELECOM SudParis, expert on software development forges and interoperability in Libre Software development projects. Contributor to FusionForge, Debian, etc.
Issues
• A universal Free Software project description format;
• A universal distribution package description format;
• How to inter-link documents about the same program:
  • Bug reports (upstream and in all distros)
  • Security advisories
• Are we developing Free Software in closed silos?
• What's usually wrong with [ XML | JSON | YAML | RFC822 ] formats?
Linked Open Debian Development Meta-Data
Let's try and make as many Debian facts as possible available to humans + machines.
Adopting the 5 Open Data principles, using RDF for Debian meta-data:
• make your stuff available on the web
• make it available as structured data
• non-proprietary format
• use URLs to identify things, so that people can point at your stuff (RDF)
• link your data to other people's data to provide context (Linked RDF)
Aligns quite well with point 3, "We will not hide problems" (Debian social contract).
What’s the Debian PTS
http://packages.qa.debian.org/
Architecture of the PTS
The PTS used to be for humans
Improved from the source diagram in the Package Tracking System Hacking guide by Raphaël Hertzog
Producing HTML for humans and RDF for machines
NEW: the PTS now generates RDF meta-data to be consumed by machines (roughly 1.5 M triples).
We use content negotiation to redirect to the proper document.
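The redirection step can be sketched as follows. This is a hypothetical helper, not the actual PTS code: given the client's Accept header, it picks which representation of a package page to serve (HTML for browsers, Turtle or RDF/XML for machines).

```python
# Minimal sketch of server-side content negotiation (hypothetical helper,
# not the real PTS implementation).

def negotiate(accept_header: str) -> str:
    """Return the file extension to redirect to for a given Accept header."""
    rdf_types = ("text/turtle", "application/rdf+xml")
    for media_type in accept_header.split(","):
        media_type = media_type.split(";")[0].strip()  # drop q-values
        if media_type in rdf_types:
            return ".ttl" if media_type == "text/turtle" else ".rdf"
    # Default: browsers and anything else get the human-readable page.
    return ".html"

if __name__ == "__main__":
    print(negotiate("text/turtle"))                      # a machine asking for Turtle
    print(negotiate("text/html,application/xhtml+xml"))  # a browser
```

A real server would additionally honor q-values and wildcards; this sketch only shows the basic branching between human and machine representations.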
Model : Graph of RDF resources
Reference Linked Data resources with canonical URIs like <http://packages.qa.debian.org/PACKAGE#RESOURCE_ID>
The greyed resources correspond to upstream components.
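The canonical-URI pattern above can be illustrated with a small helper (hypothetical, for illustration only): every RDF resource describing a source package hangs off the package's page URL, distinguished by a fragment identifier.

```python
# Hypothetical helper showing the canonical URI scheme
# http://packages.qa.debian.org/PACKAGE#RESOURCE_ID

BASE = "http://packages.qa.debian.org"

def resource_uri(package: str, resource_id: str) -> str:
    """Build the canonical URI for one RDF resource of a source package."""
    return f"{BASE}/{package}#{resource_id}"

if __name__ == "__main__":
    # The project resource and one release of apache2, as in the Turtle example.
    print(resource_uri("apache2", "project"))
    print(resource_uri("apache2", "apache2_2.2.22-11"))
```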
Apache2 packaging as RDF (Turtle notation)
http://packages.qa.debian.org/a/apache2.ttl (soon)
@prefix doap: <http://usefulinc.com/ns/doap#> .
@prefix admssw: <http://purl.org/adms/sw/> .

<http://packages.qa.debian.org/apache2#project>
    a admssw:SoftwareProject ;
    doap:name "apache2" ;
    doap:description "Debian apache2 source packaging" ;
    doap:homepage <http://packages.debian.org/src:apache2> ;
    doap:homepage <http://packages.qa.debian.org/apache2> ;
    doap:release <http://packages.qa.debian.org/apache2#apache2_2.2.22-11> ;
    schema:contributor [
        a foaf:OnlineAccount ;
        foaf:accountName "Debian Apache Maintainers" ;
        foaf:accountServiceHomepage <http://qa.debian.org/developer.php?login=debian-apache@lists.debian.org>
    ] .

<http://packages.qa.debian.org/apache2#apache2_2.2.22-11>
    a admssw:SoftwareRelease ;
    rdfs:label "apache2 2.2.22-11" ;
    doap:revision "2.2.22-11" ;
    admssw:package <http://packages.qa.debian.org/apache2#apache2_2.2.22-11.dsc> ;
    admssw:includedAsset <http://packages.qa.debian.org/apache2#upstreamsrc_2.2.22> ;
    admssw:includedAsset <http://packages.qa.debian.org/apache2#debiansrc_2.2.22-11> .
The ADMS.SW ontology
Asset Description Metadata Schema for Software (ADMS.SW)
• Pilot: EC / Interoperability Solutions for European Public Administrations (ISA) - cf. the Joinup site
• Exchanging project / package / release descriptions across development platforms and directories
Open specifications
Not too much NIH syndrome; it reuses:
• ADMS / RADion (generic meta-data for semantic asset indexing)
• DOAP (Description Of A Project)
• SPDX™ (Software Package Data Exchange®)
• W3C Government Linked Data (GLD) Working Group
Version 1.0 was issued on 2012/06/29
An RDF validator was issued a few weeks ago
ADMS.SW main concepts
Project (= Program), Release, Package
Modeling Debian will require further extensions.
Traceability over the FLOSS ecosystem
Collecting descriptions of packages/projects from their source projects as Linked Data (DOAP or ADMS.SW) into an RDF triple store.
For instance:
• For Debian, from the PTS: full ADMS.SW dump at p.q.d.o:/srv/packages.qa.debian.org/www/web/full-dump.rdf
• For Apache, from DOAP files (see http://projects.apache.org/docs/index.htm)
• . . . Add your preferred upstream . . .
Following links between bugs, packages, security alerts, all in semantic, interoperable meta-data formats!
Matching project/package descriptions
Example (SPARQL query to match packages by their homepages)
PREFIX doap: <http://usefulinc.com/ns/doap#>
SELECT * WHERE
{
  GRAPH <http://packages.qa.debian.org/>
  {
    ?dp doap:homepage ?h
  }
  GRAPH <http://projects.apache.org/>
  {
    ?ap doap:homepage ?h
  }
}
"Semantic query": trying to match source packages in Debian whose upstream projects' homepages match those in the Apache projects' DOAP descriptors.
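The join performed by that SPARQL query can be mimicked in plain Python. This is a sketch with made-up sample data, not the real dumps: two maps of package/project to declared homepage are joined on the shared homepage value, just as `?dp doap:homepage ?h` and `?ap doap:homepage ?h` share the variable `?h`.

```python
# Sketch of the homepage join, with hypothetical sample data
# (the real query runs over the Debian PTS and Apache DOAP dumps).

debian = {  # Debian source package -> Homepage control field
    "apr": "http://apr.apache.org/",
    "ivy": "http://ant.apache.org/ivy/",
    "bash": "http://www.gnu.org/software/bash/",
}
apache = {  # Apache project -> doap:homepage
    "apr": "http://apr.apache.org/",
    "ivy": "http://ant.apache.org/ivy/",
    "httpd": "http://httpd.apache.org/",
}

# Join on the shared homepage, like the shared ?h variable in SPARQL.
matches = sorted(
    (dp, h, ap)
    for dp, h in debian.items()
    for ap, ah in apache.items()
    if h == ah
)

if __name__ == "__main__":
    for dp, h, ap in matches:  # bash and httpd have no counterpart, so they drop out
        print(dp, h, ap)
```

In a triple store the same join happens inside the SPARQL engine; the point here is only to show that matching is nothing more exotic than an equi-join on homepage URLs.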
Matching packages
Results: 62 matching Apache projects packaged in Debian (those for which maintainers set the Homepage control field consistently).
Example (matching upstream Apache project homepages with Debian source packages')
dp h ap
ivy ant.a.o/ivy/ ant.a.o/ivy/
apr apr.a.o/ apr.a.o/
apr-util apr.a.o/ apr.a.o/
libcommons-cli-java commons.a.o/cli/ commons.a.o/cli/
libcommons-codec-java commons.a.o/codec/ commons.a.o/codec/
libcommons-collections3-java commons.a.o/collections/ commons.a.o/collections/
libcommons-collections-java commons.a.o/collections/ commons.a.o/collections/
commons-daemon commons.a.o/daemon/ commons.a.o/daemon/
libcommons-discovery-java commons.a.o/discovery/ commons.a.o/discovery/
libcommons-el-java commons.a.o/el/ commons.a.o/el/
libcommons-fileupload-java commons.a.o/fileupload/ commons.a.o/fileupload/
commons-io commons.a.o/io/ commons.a.o/io/
commons-jci commons.a.o/jci/ commons.a.o/jci/
libcommons-launcher-java commons.a.o/launcher/ commons.a.o/launcher/
... ... ...
Related initiatives
About ADMS.SW :
• Adding ADMS.SW support to FusionForge (hence soon in Alioth)
Related initiatives about package meta-data:
• AppStream (Software Center, DEP11, etc.)
• Umegaya (upstream meta-data, links with research publications, etc.)
• DistroMatch (match package names across distributions)
Let's discuss further steps (replacing YAML with Turtle, etc. ;-)
Fin
More details at:
• http://wiki.debian.org/qa.debian.org/pts/RdfInterface
• "Linked Data descriptions of Debian source packages using ADMS.SW", to be published soon in the SWESE 2012 proceedings
Contact:
Micro-blogging: @oberger http://identi.ca/oberger/
Email: mailto:obergix@debian.org
Blog: http://www-public.telecom-sudparis.eu/~berger_o/weblog/
Copyright:
Copyright 2012 Institut Mines Telecom + Olivier Berger
License of this presentation: Creative Commons Share Alike (except illustrations, which remain under the copyright of their respective owners)