© 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0
The FOSSology Project
FOSSology & GSOC Journey
shaheem.azmal@siemens.com <Shaheem Azmal M MD>
mishra.gaurav@siemens.com <Gaurav Mishra>
Agenda
• FOSSology introduction
• New features since last year
• FOSSology Scanning in CI
• GSOC
• Conclusion
The Problem Actually
Distributing open source software requires you to:
∙ Provide licenses of involved software
∙ Provide copyright statements of involved authors
∙ Provide disclaimers
∙ … and much more
You know these examples
Finding Licenses
It is about finding licenses:
∙ License texts
∙ References to licenses
∙ Written texts explaining licensing
∙ License relevant statements
What is FOSSology?
A Web server application for license and copyright compliance of software components.
FOSSology Project
https://www.fossology.org/
∙ First published in 2008 under GPL-2.0
∙ 2015: Linux Foundation collaboration project
∙ Web server based and command line interfaces
∙ Scanning agents searching for license and copyright relevant hits (and more …)
∙ A multi-user / multi-tenant Web UI for reviewing and organizing clearing jobs
FOSSology Development
https://www.github.com/fossology/fossology
▪ Standard Web application stack:
▪ Linux, Apache 2, PostgreSQL, PHP
▪ Web-based UI in PHP, but scanners written in C / C++
▪ Two ways to interact:
▪ Web user interface
▪ Command line utilities
How does FOSSology work?
• Upload a source code archive (*.zip, *.tar.gz, etc.)
• Agents scan for license relevant text
• Agents also scan for copyrights, Export Control (ECC), your own keywords, etc.
• Review scanner results for wrong license classification
• Review other scanner findings (copyrights, ECC)
• Result of the “clearing”
• SPDX reporting
• Generated notice or readme file
• debian-copyright
Upload Component → Agents Scanning → Review Results → Generate Reporting → Pass Report to Client
FOSSology Feature Overview
A Web server application for license and copyright compliance of software components.
License Scan features
∙ Regular expression scanner
∙ Text similarity scanner
∙ License (text) management
∙ Aggregation of licenses in hierarchical view
∙ License histogram
∙ Supporting concluded vs. found license
∙ Bulk processing of files with same licensing
∙ Reusing of license conclusions
Other features
▪ Copyright, authorship statements scanner
▪ Export control and customs scanner
▪ Command line interfaces
▪ Reporting
▪ SPDX RDF and tag-value
▪ Debian-copyright
▪ Plain text output
▪ Files sorting in buckets
▪ User, group and upload management
New Features of FOSSology since last year
▪ Update licenses from SPDX.
▪ Integration with corporate authentication LDAP.
▪ Change permissions for multiple uploads with a single click.
▪ Export all found copyright statements as CSV.
▪ Improvement of analysis report and standalone operations for different agents.
▪ New Agent OJO to scan SPDX-License-Identifier.
▪ Lots of improvements in REST API of FOSSology.
▪ And many more…
FOSSology Scanning In CI
Power of Open Source, benefits of automation
WHY?
Is the current way good enough?
• Lots of code changes
• Preparation for release
• Release
• Perform license and copyright scanning
• License conflict
• Go or no-go decision
New way
Ease the load with automation
New change (bug fix, feature) → Continuous scan (licenses, copyrights, keywords) → Release audit
• Smooth: easy, fewer changes
• Failure
Changes required
• .gitlab-ci.yml
• .travis.yml
• whitelist.json
Check out the documentation:
https://github.com/fossology/fossology/wiki/FOSSology-as-CI-scanner
Pipeline status
GitLab
License check failure Oll Korrect
Pipeline status
Travis
License check failure Oll Korrect
Output
License failure
Output
Copyright failure
Potential whitelist file
Scanners availability
The following scanners are shipped with the runner:
nomos
• Most trusted scanner in FOSSology
• Uses regular expressions and heuristics
ojo
• SPDX License Identifier scanner
• Can find licenses attached using WITH, AND, OR
• Uses regular expressions
• Lightning fast
copyright
• Very few false negatives
• Can find email addresses and URLs too
• Uses regular expressions
keyword
• Helps in finding potentially harmful keywords such as:
• licensed, modify it under, etc.
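The idea behind an SPDX identifier scanner such as ojo can be illustrated with a short Python sketch. This is not ojo's implementation; the regular expressions and the function name `find_spdx_ids` are assumptions made for illustration. It finds `SPDX-License-Identifier` tags and splits the expression on the WITH, AND and OR operators.

```python
import re

# Matches a tag such as "SPDX-License-Identifier: MIT OR Apache-2.0".
# The expression ends at a line break or at a "*" (end of a block comment).
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([^\n\r*]+)")
# Operators that join identifiers inside an SPDX expression.
OPERATORS = re.compile(r"\s+(?:AND|OR|WITH)\s+")


def find_spdx_ids(text):
    """Return the license and exception ids named in SPDX tags, in order."""
    found = []
    for match in SPDX_TAG.finditer(text):
        # Parentheses only group sub-expressions, so drop them.
        expression = match.group(1).replace("(", " ").replace(")", " ").strip()
        for part in OPERATORS.split(expression):
            part = part.strip()
            if part and part not in found:
                found.append(part)
    return found
```

A file carrying both `GPL-2.0-only WITH Classpath-exception-2.0` and `(MIT OR Apache-2.0)` tags would yield all four identifiers. A real scanner would also validate each identifier against the SPDX license list.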
Scanning modes
Diff scanning
• Default scanning mode
• Scans only the diff created by the merge request
• Reduced set of data to scan
• Faster feedback at commit level for developers creating the changes
• Good for build CI pipelines
Repo scan
• Enabled with the repo flag
• Scans the complete repo at that particular commit
• Provides a good overview of the repo for audit work
• Can be scheduled to run at set intervals (cron)
• Good for release/tag pipelines
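The difference between the two modes comes down to which files are handed to the scanners. The Python sketch below shows one way to select them with git. It is an assumption for illustration, not the runner's actual code; the function name `list_files_command` and the default base reference `origin/master` are hypothetical.

```python
def list_files_command(mode="diff", base_ref="origin/master"):
    """Build the git command that lists the files to scan.

    mode="diff": only files changed on HEAD since it branched from base_ref.
    mode="repo": every tracked file at the current commit.
    """
    if mode == "diff":
        # --diff-filter=d leaves out deleted files, which cannot be scanned.
        return ["git", "diff", "--name-only", "--diff-filter=d", f"{base_ref}...HEAD"]
    if mode == "repo":
        return ["git", "ls-files"]
    raise ValueError(f"unknown scan mode: {mode!r}")
```

The returned list can be passed to `subprocess.run(..., capture_output=True, text=True, check=True)`, and the lines of its standard output are the paths to scan.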
Whitelisting
licenses
• List of licenses which are whitelisted
• Each license needs to be explicitly mentioned to avoid false negatives
exclude
• Files to exclude from scan
• Configuration or test folders
• Understands file glob wildcard characters
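How a whitelist with these two sections could be applied to scanner findings is sketched below in Python. Only the key names `licenses` and `exclude` come from the slide; the file content, the shape of `findings` and the function `check_findings` are hypothetical, and the real whitelist.json format is described in the FOSSology documentation.

```python
import fnmatch
import json

# Hypothetical whitelist content; only the two key names come from the slide.
WHITELIST_JSON = """
{
  "licenses": ["MIT", "Apache-2.0"],
  "exclude": ["tests/*", "*.json"]
}
"""


def check_findings(findings, whitelist):
    """Return (path, license) pairs that are neither excluded nor whitelisted.

    findings maps a file path to the list of licenses a scanner reported.
    """
    allowed = set(whitelist.get("licenses", []))
    patterns = whitelist.get("exclude", [])
    violations = []
    for path, licenses in sorted(findings.items()):
        if any(fnmatch.fnmatch(path, pattern) for pattern in patterns):
            continue  # excluded files are not checked at all
        for name in licenses:
            if name not in allowed:
                violations.append((path, name))
    return violations


whitelist = json.loads(WHITELIST_JSON)
findings = {
    "src/main.c": ["MIT"],
    "src/util.c": ["GPL-3.0-only"],
    "tests/data.c": ["GPL-2.0-only"],
}
print(check_findings(findings, whitelist))  # [('src/util.c', 'GPL-3.0-only')]
```

A CI job would fail the pipeline whenever the returned list is non-empty.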
Benefits
• Time
• Frequent checks
• Faster audit
• Faster release
• Fewer changes
• Fewer errors
Sample projects and pipelines
▪ GitLab:
▪ https://gitlab.com/GMishx/fossology/-/merge_requests/2/pipelines
▪ https://gitlab.com/GMishx/fossology/-/merge_requests/3/pipelines
▪ Travis:
▪ https://github.com/GMishx/fossology/pulls
▪ https://travis-ci.com/github/GMishx/fossology/builds/173617637
▪ https://travis-ci.com/github/GMishx/fossology/builds/173617688
▪ Pull Request:
▪ https://github.com/fossology/fossology/pull/1736
What is Google Summer of Code?
GSoC is an international annual program in which Google awards stipends to students who successfully complete a software coding project for an open source organization during the summer.
Disclaimer: All third party logos and icons referenced by this slide are the property of their respective owners. They are just used to highlight the UI.
GSoC Timeline
Preparations: Gathering ideas
● Involvement of community
● Using GitHub issues
Finalizing: Idea selection for GSoC
● Filtering ideas
● Moving to Wiki
● Labelling issues
Application: Applying for GSoC
● Preparation of application
● New channel in Slack
Proposals: Collaboration on selection
● Gathering proposals
● Slack, email, GitHub issue
Background Google Summer of Code
For Students
∙ Experience in writing code
∙ Collaborate in OSS project
∙ Work in a distributed environment
∙ Internship experience
∙ Internship stipend by Google
For Mentors
▪ Positive visibility
▪ Meet new students
▪ Extend the OSS community
▪ Experience distributed collaboration
GSoC 2018 & 2019
● Aman Jain: Atarashi
● Vivek: Spasht
● Sandeep: Software Heritage Agent
● Ayush: Atarashi Agent
GSoC 2020
● Ayush: Code Comment
● Kaushlendra: Atarashi enhancement
● Darshan: Dashboard
Reporting
● Project Goals
● Weekly Progress Report
● Milestone achieved
● First Evaluation
● Second Evaluation
● Third Evaluation
Post GSoC
● Student involvement in coding
● Bi-weekly meetings for interested students to discuss the community progress
How students are helping us after GSoC:
● Messengers of FOSSology
● Mentoring interested students
● Continuous collaboration
Atarashi
A Step towards non-rule based standalone command line scanner… (https://github.com/fossology/atarashi)
Different methods for scanning license statements
• Unlike rule-based approaches, like Nomos, Atarashi implements multiple text statistics and information retrieval algorithms.
Distance finding algorithms
• Word Frequency Similarity
• Term frequency-inverse document frequency (tf-idf)
• Damerau–Levenshtein distance
• N-grams
Similarity finding algorithms
• Score Similarity
• Cosine Similarity
• Dice Similarity
• Bi-gram Cosine Similarity
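Two of the listed measures are easy to state in a few lines of Python. The sketch below is a generic textbook formulation, not Atarashi's code; the tokenizer and the choice of word bigrams for the Dice coefficient are assumptions.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between the word-frequency vectors of a and b."""
    freq_a, freq_b = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(freq_a[word] * freq_b[word] for word in freq_a)
    norm = math.sqrt(sum(n * n for n in freq_a.values())) * math.sqrt(
        sum(n * n for n in freq_b.values())
    )
    return dot / norm if norm else 0.0


def dice_similarity(a, b):
    """Dice coefficient over the sets of word bigrams of a and b."""
    def bigrams(text):
        words = tokens(text)
        return set(zip(words, words[1:]))

    set_a, set_b = bigrams(a), bigrams(b)
    total = len(set_a) + len(set_b)
    return 2 * len(set_a & set_b) / total if total else 0.0
```

Both functions return values between 0 and 1, where 1 means the two texts have identical word counts (cosine) or identical bigram sets (Dice). A scanner compares the input against every known license text and keeps the highest scores.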
Atarashi: Workflow
• Process input file
  • Extract comments
  • Normalize text
• Match SPDX headers and SPDX identifiers
• Apply distance finding algorithms
• Rank results based on similarity
• Generate the output
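The workflow steps can be strung together in a toy Python pipeline. Everything here is an assumption for illustration and not Atarashi's code: the reference texts are shortened stand-ins, only C-style comments are extracted, only a single SPDX identifier is recognized, and a plain word-overlap score replaces the distance finding algorithms.

```python
import re
from collections import Counter

# Hypothetical, heavily shortened reference texts; a real scanner loads full license texts.
REFERENCE_TEXTS = {
    "MIT": "permission is hereby granted free of charge to any person obtaining a copy",
    "Apache-2.0": "licensed under the apache license version 2.0",
}


def extract_comments(source):
    """Collect the text of // line comments and /* ... */ block comments."""
    parts = re.findall(r"//([^\n]*)|/\*(.*?)\*/", source, flags=re.DOTALL)
    return " ".join(line or block for line, block in parts)


def normalize(text):
    """Lower-case and keep only words, so layout differences do not matter."""
    return re.findall(r"[a-z0-9]+", text.lower())


def scan(source):
    """Return (license, score) pairs, best match first."""
    comments = extract_comments(source)
    tagged = re.search(r"SPDX-License-Identifier:\s*([\w.+-]+)", comments)
    if tagged:
        return [(tagged.group(1), 1.0)]  # an explicit identifier wins
    words = Counter(normalize(comments))
    results = []
    for name, reference in REFERENCE_TEXTS.items():
        ref_words = Counter(normalize(reference))
        shared = sum((words & ref_words).values())  # words common to both texts
        results.append((name, shared / max(sum(ref_words.values()), 1)))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `scan` on a source file returns a ranked list, so the caller can report the top match or apply a score threshold before reporting anything.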
Integration with ClearlyDefined (Spasht) and Software Heritage.
Making Conclusion easier
FOSS-dash extracts meaningful data from fossology_DB and exports those metrics to InfluxDB, a time-series database. Grafana is then used to query the metrics and visualize them as charts and graphs in the dashboard.
Dashboard
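Exporting a metric to a time-series database mostly means serializing it in the format the database ingests. The Python sketch below formats one data point in InfluxDB line protocol. It is not FOSS-dash code; the function names are assumptions, and the sketch assumes a plain measurement name without spaces or commas.

```python
def escape(value):
    """Escape commas, equals signs and spaces in tag keys, tag values and field keys."""
    text = str(value)
    for char in (",", "=", " "):
        text = text.replace(char, "\\" + char)
    return text


def format_field(value):
    """Render a field value: integers get an 'i' suffix, strings are quoted."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    return '"' + str(value).replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_line(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=value field=value timestamp."""
    tag_part = "".join(f",{escape(k)}={escape(v)}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{escape(k)}={format_field(v)}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"
```

For example, a hypothetical `license_count` measurement tagged with a license name and carrying a file count becomes a single text line that can be written to the database and later queried from Grafana.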
Thank you for your attention!
© 2016-2020 Siemens AG, The Linux Foundation
CC-BY-SA 4.0
https://creativecommons.org/licenses/by-sa/4.0/
Internet
https://www.fossology.org
GitHub
https://github.com/fossology/fossology
Further Links
https://www.spdx.org
https://www.openchainproject.org
https://github.com/eclipse/sw360
Contact :
FOSSology Mailing list
• fossology@fossology.org
Email us
• shaheem.azmal@siemens.com
• mishra.gaurav@siemens.com

More Related Content

What's hot

Introduction to FOSS
Introduction to FOSSIntroduction to FOSS
Introduction to FOSS
mgamal87
 
Open source Software: pros and cons
Open source Software: pros and consOpen source Software: pros and cons
Open source Software: pros and cons
ygpriya
 

What's hot (20)

Intro to open source - 101 presentation
Intro to open source - 101 presentationIntro to open source - 101 presentation
Intro to open source - 101 presentation
 
Open Source Software
Open Source SoftwareOpen Source Software
Open Source Software
 
The Role of Collective Management Organizations and the Importance of Good Go...
The Role of Collective Management Organizations and the Importance of Good Go...The Role of Collective Management Organizations and the Importance of Good Go...
The Role of Collective Management Organizations and the Importance of Good Go...
 
Open source business models
Open source business modelsOpen source business models
Open source business models
 
IPR protection to computer softwares
IPR protection to computer softwaresIPR protection to computer softwares
IPR protection to computer softwares
 
Introduction to FOSS
Introduction to FOSSIntroduction to FOSS
Introduction to FOSS
 
An Introduction to Open Source Software and Web Application Development
An Introduction to Open Source Software and Web Application DevelopmentAn Introduction to Open Source Software and Web Application Development
An Introduction to Open Source Software and Web Application Development
 
Introduction To Open Source Licenses
Introduction To Open Source LicensesIntroduction To Open Source Licenses
Introduction To Open Source Licenses
 
[Capella Day Toulouse] Driving intelligent transportation systems with Capella
[Capella Day Toulouse] Driving intelligent transportation systems with Capella[Capella Day Toulouse] Driving intelligent transportation systems with Capella
[Capella Day Toulouse] Driving intelligent transportation systems with Capella
 
Open Source: What is It?
Open Source: What is It?Open Source: What is It?
Open Source: What is It?
 
Comprendre les licences de logiciels libres
Comprendre les licences de logiciels libresComprendre les licences de logiciels libres
Comprendre les licences de logiciels libres
 
Consulting Services Operation Manual, Asian Development Bank
Consulting Services Operation Manual, Asian Development BankConsulting Services Operation Manual, Asian Development Bank
Consulting Services Operation Manual, Asian Development Bank
 
IPR notes
IPR notesIPR notes
IPR notes
 
Open Source Software
Open Source SoftwareOpen Source Software
Open Source Software
 
An Introduction to Free and Open Source Software Licensing and Business Models
An Introduction to Free and Open Source Software Licensing and Business ModelsAn Introduction to Free and Open Source Software Licensing and Business Models
An Introduction to Free and Open Source Software Licensing and Business Models
 
Open source software
Open source software Open source software
Open source software
 
Copyright in india
Copyright in indiaCopyright in india
Copyright in india
 
Open source Software: pros and cons
Open source Software: pros and consOpen source Software: pros and cons
Open source Software: pros and cons
 
Understanding open source licenses
Understanding open source licensesUnderstanding open source licenses
Understanding open source licenses
 
FOSS
FOSS FOSS
FOSS
 

Similar to FOSSology & GSOC Journey

Similar to FOSSology & GSOC Journey (20)

TeamForge Overview Webinar (8/24)
TeamForge Overview Webinar (8/24)TeamForge Overview Webinar (8/24)
TeamForge Overview Webinar (8/24)
 
Selecting an Open Source License and Business Model for Your Project to Have ...
Selecting an Open Source License and Business Model for Your Project to Have ...Selecting an Open Source License and Business Model for Your Project to Have ...
Selecting an Open Source License and Business Model for Your Project to Have ...
 
ASWF Technical Advisory Council: How to Enable An Open Source Community
ASWF Technical Advisory Council: How to Enable An Open Source CommunityASWF Technical Advisory Council: How to Enable An Open Source Community
ASWF Technical Advisory Council: How to Enable An Open Source Community
 
Using containers and Continuous Packaging to Build native FOSSology packages
Using containers and Continuous Packaging to Build native FOSSology packagesUsing containers and Continuous Packaging to Build native FOSSology packages
Using containers and Continuous Packaging to Build native FOSSology packages
 
Building .NET Microservices
Building .NET MicroservicesBuilding .NET Microservices
Building .NET Microservices
 
TFI2014 Session II - Requirements for SDN - Brian Field
TFI2014 Session II - Requirements for SDN - Brian FieldTFI2014 Session II - Requirements for SDN - Brian Field
TFI2014 Session II - Requirements for SDN - Brian Field
 
Continuous Delivery: Fly the Friendly CI in Pivotal Cloud Foundry with Concourse
Continuous Delivery: Fly the Friendly CI in Pivotal Cloud Foundry with ConcourseContinuous Delivery: Fly the Friendly CI in Pivotal Cloud Foundry with Concourse
Continuous Delivery: Fly the Friendly CI in Pivotal Cloud Foundry with Concourse
 
Apidays Paris 2023 - Managing OpenAPI Documents at Scale, Stéve Sfartz, Cisco
Apidays Paris 2023 - Managing OpenAPI Documents at Scale, Stéve Sfartz, CiscoApidays Paris 2023 - Managing OpenAPI Documents at Scale, Stéve Sfartz, Cisco
Apidays Paris 2023 - Managing OpenAPI Documents at Scale, Stéve Sfartz, Cisco
 
The 12 facets of the OpenAPI standard.pdf
The 12 facets of the OpenAPI standard.pdfThe 12 facets of the OpenAPI standard.pdf
The 12 facets of the OpenAPI standard.pdf
 
Collision 2018: CodeStar for CICD Pipelines
Collision 2018: CodeStar for CICD PipelinesCollision 2018: CodeStar for CICD Pipelines
Collision 2018: CodeStar for CICD Pipelines
 
SpringIO 2016 - Spring Cloud MicroServices, a journey inside a financial entity
SpringIO 2016 - Spring Cloud MicroServices, a journey inside a financial entitySpringIO 2016 - Spring Cloud MicroServices, a journey inside a financial entity
SpringIO 2016 - Spring Cloud MicroServices, a journey inside a financial entity
 
Spring IO 2016 - Spring Cloud Microservices, a journey inside a financial entity
Spring IO 2016 - Spring Cloud Microservices, a journey inside a financial entitySpring IO 2016 - Spring Cloud Microservices, a journey inside a financial entity
Spring IO 2016 - Spring Cloud Microservices, a journey inside a financial entity
 
OpenChain Automotive Work Group Meeting #2 - Lyon
OpenChain Automotive Work Group Meeting #2 - LyonOpenChain Automotive Work Group Meeting #2 - Lyon
OpenChain Automotive Work Group Meeting #2 - Lyon
 
Choisir le bon business model et la bonne licence pour la survie de son proje...
Choisir le bon business model et la bonne licence pour la survie de son proje...Choisir le bon business model et la bonne licence pour la survie de son proje...
Choisir le bon business model et la bonne licence pour la survie de son proje...
 
DevSecOps - Security in DevOps
DevSecOps - Security in DevOpsDevSecOps - Security in DevOps
DevSecOps - Security in DevOps
 
Choosing the right business model and license - OW2con'19, June 12-13, 2019, ...
Choosing the right business model and license - OW2con'19, June 12-13, 2019, ...Choosing the right business model and license - OW2con'19, June 12-13, 2019, ...
Choosing the right business model and license - OW2con'19, June 12-13, 2019, ...
 
CloudNativeAalborg2023_Jan.pdf
CloudNativeAalborg2023_Jan.pdfCloudNativeAalborg2023_Jan.pdf
CloudNativeAalborg2023_Jan.pdf
 
CNCF Introduction - Feb 2018
CNCF Introduction - Feb 2018CNCF Introduction - Feb 2018
CNCF Introduction - Feb 2018
 
2016 Federal User Group Conference - TeamForge Capabilities and Directions
2016 Federal User Group Conference - TeamForge Capabilities and Directions2016 Federal User Group Conference - TeamForge Capabilities and Directions
2016 Federal User Group Conference - TeamForge Capabilities and Directions
 
Verification at scale: Fitting static code analysis into continuous integration
Verification at scale: Fitting static code analysis into continuous integrationVerification at scale: Fitting static code analysis into continuous integration
Verification at scale: Fitting static code analysis into continuous integration
 

More from Gaurav Mishra

More from Gaurav Mishra (11)

FOSSology and OSS-Tools for License Compliance and Automation
FOSSology and OSS-Tools for License Compliance and AutomationFOSSology and OSS-Tools for License Compliance and Automation
FOSSology and OSS-Tools for License Compliance and Automation
 
Block Chain - Merkel and Key exchange
Block Chain - Merkel and Key exchangeBlock Chain - Merkel and Key exchange
Block Chain - Merkel and Key exchange
 
Block Chain - Introduction
Block Chain - IntroductionBlock Chain - Introduction
Block Chain - Introduction
 
Backup using rsync
Backup using rsyncBackup using rsync
Backup using rsync
 
Disk quota and sysd procd
Disk quota and sysd procdDisk quota and sysd procd
Disk quota and sysd procd
 
Linux User Management
Linux User ManagementLinux User Management
Linux User Management
 
Apache, cron and proxy
Apache, cron and proxyApache, cron and proxy
Apache, cron and proxy
 
Linux Run Level
Linux Run LevelLinux Run Level
Linux Run Level
 
Firewall and IPtables
Firewall and IPtablesFirewall and IPtables
Firewall and IPtables
 
Linux securities
Linux securitiesLinux securities
Linux securities
 
wget, curl and scp
wget, curl and scpwget, curl and scp
wget, curl and scp
 

Recently uploaded

Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven CuriosityUnlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Hung Le
 
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
ZurliaSoop
 
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
David Celestin
 

Recently uploaded (20)

My Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle BaileyMy Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle Bailey
 
Lions New Portal from Narsimha Raju Dichpally 320D.pptx
Lions New Portal from Narsimha Raju Dichpally 320D.pptxLions New Portal from Narsimha Raju Dichpally 320D.pptx
Lions New Portal from Narsimha Raju Dichpally 320D.pptx
 
ICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdfICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdf
 
Dreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIIDreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio III
 
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdfAWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
 
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdfSOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
 
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven CuriosityUnlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
 
Dreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video TreatmentDreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video Treatment
 
Digital collaboration with Microsoft 365 as extension of Drupal
Digital collaboration with Microsoft 365 as extension of DrupalDigital collaboration with Microsoft 365 as extension of Drupal
Digital collaboration with Microsoft 365 as extension of Drupal
 
BEAUTIFUL PLACES TO VISIT IN LESOTHO.pptx
BEAUTIFUL PLACES TO VISIT IN LESOTHO.pptxBEAUTIFUL PLACES TO VISIT IN LESOTHO.pptx
BEAUTIFUL PLACES TO VISIT IN LESOTHO.pptx
 
Report Writing Webinar Training
Report Writing Webinar TrainingReport Writing Webinar Training
Report Writing Webinar Training
 
Call Girls Near The Byke Suraj Plaza Mumbai »¡¡ 07506202331¡¡« R.K. Mumbai
Call Girls Near The Byke Suraj Plaza Mumbai »¡¡ 07506202331¡¡« R.K. MumbaiCall Girls Near The Byke Suraj Plaza Mumbai »¡¡ 07506202331¡¡« R.K. Mumbai
Call Girls Near The Byke Suraj Plaza Mumbai »¡¡ 07506202331¡¡« R.K. Mumbai
 
Introduction to Artificial intelligence.
Introduction to Artificial intelligence.Introduction to Artificial intelligence.
Introduction to Artificial intelligence.
 
LITTLE ABOUT LESOTHO FROM THE TIME MOSHOESHOE THE FIRST WAS BORN
LITTLE ABOUT LESOTHO FROM THE TIME MOSHOESHOE THE FIRST WAS BORNLITTLE ABOUT LESOTHO FROM THE TIME MOSHOESHOE THE FIRST WAS BORN
LITTLE ABOUT LESOTHO FROM THE TIME MOSHOESHOE THE FIRST WAS BORN
 
BIG DEVELOPMENTS IN LESOTHO(DAMS & MINES
BIG DEVELOPMENTS IN LESOTHO(DAMS & MINESBIG DEVELOPMENTS IN LESOTHO(DAMS & MINES
BIG DEVELOPMENTS IN LESOTHO(DAMS & MINES
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.
 
in kuwait௹+918133066128....) @abortion pills for sale in Kuwait City
in kuwait௹+918133066128....) @abortion pills for sale in Kuwait Cityin kuwait௹+918133066128....) @abortion pills for sale in Kuwait City
in kuwait௹+918133066128....) @abortion pills for sale in Kuwait City
 
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
 
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
 
History of Morena Moshoeshoe birth death
History of Morena Moshoeshoe birth deathHistory of Morena Moshoeshoe birth death
History of Morena Moshoeshoe birth death
 

FOSSology & GSOC Journey

  • 1. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project FOSSology & GSOC Journey shaheem.azmal@siemens.com <Shaheem Azmal M MD> mishra.gaurav@siemens.com <Gaurav Mishra>
  • 2. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 2 Agenda • FOSSology introduction • New features since last year • FOSSology Scanning in CI • GSOC • Conclusion
  • 3. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 3 The Problem Actually Distributing open source software requires to ∙ Provide licenses of involved software ∙ Provide copyright statements of involved authors ∙ Provide disclaimers ∙ … and much more You know these examples
  • 4. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 4 It is about finding licenses ∙ License texts ∙ References to licenses ∙ Written texts explaining licensing ∙ License relevant statements Finding Licenses
  • 5. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 5 What is FOSSology? A Web server application for license and copyright compliance of software components. FOSSology Project https://www.fossology.org/ ∙ Published first in 2008, GPL-2.0 ∙ 2015: Linux Foundation collaboration project ∙ Web server based and command line interfaces ∙ Scanning agents searching for license and copyright relevant hits (and more …) ∙ A multi-user / multi-tenant Web UI for review organizing clearing job FOSSology Development https://www.github.com/fossology/fossology ▪ Standard Web application stack: ▪ Linux, Apache 2, PostgreSQL, PHP, ▪ Web-based UI in PHP, but scanners written in C / C++ ▪ Two ways to interact: ▪ Web user interface ▪ Command line utilities
  • 6. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 6 How does FOSSology work? • Uploading source code archive (*.zip, *.tar.gz, etc) • Agents scan for license relevant text • Copyrights, Export Control (ECC), your keywords to look for etc. • Review scanner results for wrong license classification • Review other scanner findings (copyrights, ECC) • Result of the “clearing” • SPDX reporting • Generated notice or readme file • debian-copyright Upload Component Agents Scanning Review Results Generate Reporting Pass Report to Client
  • 7. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 7 FOSSology Feature Overview A Web server application for license and copyright compliance of software components. License Scan features ∙ Regular expression scanner ∙ Text similarity scanner ∙ License (text) management ∙ Aggregation of licenses in hierarchical view ∙ License histogram ∙ Supporting concluded vs. found license ∙ Bulk processing of files with same licensing ∙ Reusing of license conclusions Other features ▪ Copyright, authorship statements scanner ▪ Export control and customs scanner ▪ Command line interfaces ▪ Reporting ▪ SPDX RDF and tag-value ▪ Debian-copyright ▪ Plain text output ▪ Files sorting in buckets ▪ User, group and upload management
  • 8. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 8 New Features of FOSSology since last year ▪ Update licenses from SPDX. ▪ Integration with corporate authentication LDAP. ▪ Change permissions for multiple uploads with a single click. ▪ Export all found copyright statements as CSV. ▪ Improvement of analysis report and standalone operations for different agents. ▪ New Agent OJO to scan SPDX-License-Identifier. ▪ Lots of improvements in REST API of FOSSology. ▪ Many More….
  • 9. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 9 FOSSology Scanning In CI Power of Open Source, benefits of automation
  • 10. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 10 WHY? Is the current way good enough? Lot’s of code change Preparation for release Release Perform license and copyright scanning License conflict Go or no go decision
  • 11. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 11 New way Ease the load with automation New change Bug fix Feature Continuous scan Licenses Copyrights Keywords Release Audit Smooth • Easy, lesser changes Failure
  • 12. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 12 Changes required .gitlab-ci.yml whitelist.json Checkout the documentation: https://github.com/fossology/fossology/wiki/FOSSology-as-CI-scanner .travis.yml
  • 13. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 13 Pipeline status GitLab License check failure Oll Korrect
  • 14. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 14 Pipeline status Travis License check failure Oll Korrect
  • 15. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 15 Output License failure
  • 16. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 16 Output Copyright failure Potential whitelist file
  • 17. © 2016-2020 Siemens AG, Linux Foundation - CC-BY-SA 4.0 The FOSSology Project 17 nomos • Most trusted scanner in FOSSology • Uses regular expression and heuristics ojo • SPDX License Identifier scanner • Can find licenses attached using WITH, AND, OR • Uses regular expressions • Lightning fast copyright • Very low false negative findings • Can find email and URLs too • Uses regular expressions keyword • Helps in finding potential harmful keywords like: • licensed, modify it under, etc. Scanners availability Following scanners are shipped with the runner
  • 18. Scanning modes. Diff scanning: default scanning mode; scans only the diff created by the merge request; reduced set of data to scan; faster feedback at commit level for the developers creating the changes; good for a build CI pipeline. Repo scan: enabled with the repo flag; scans the complete repo at that particular commit; provides a good overview of the repo for audit work; can be scheduled to run at set intervals (cron); good for a release/tag pipeline.
  • 19. Whitelisting. licenses: list of licenses which are whitelisted; each license needs to be explicitly mentioned to avoid false negatives. exclude: files to exclude from the scan; configuration or test folders; understands file glob wildcard characters.
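The two whitelisting settings can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the JSON layout (top-level `licenses` and `exclude` keys, named after the slide headings), the helper `check_findings`, and the shape of the scanner findings. The FOSSology-as-CI-scanner wiki page is the authoritative description of the real file.

```python
import json
from fnmatch import fnmatch
from typing import Dict, List

# Hypothetical whitelist content; the keys mirror the two slide headings,
# but the exact layout expected by the scanner is an assumption.
WHITELIST_JSON = """
{
  "licenses": ["MIT", "GPL-2.0-only"],
  "exclude": ["tests/*", "*.json"]
}
"""


def check_findings(findings: Dict[str, List[str]], whitelist: dict) -> Dict[str, List[str]]:
    """Return, per file, the found licenses that are not whitelisted."""
    allowed = set(whitelist["licenses"])
    violations: Dict[str, List[str]] = {}
    for path, licenses in findings.items():
        # Skip files matching any exclude glob (configuration, tests, ...).
        if any(fnmatch(path, pattern) for pattern in whitelist["exclude"]):
            continue
        # Only an explicitly listed license passes the check.
        bad = [name for name in licenses if name not in allowed]
        if bad:
            violations[path] = bad
    return violations


whitelist = json.loads(WHITELIST_JSON)
findings = {
    "src/main.c": ["MIT"],
    "src/util.c": ["GPL-3.0-only", "MIT"],
    "tests/data.c": ["Proprietary"],
    "package.json": ["UNLICENSED"],
}
print(check_findings(findings, whitelist))
# {'src/util.c': ['GPL-3.0-only']}
```

Requiring an exact match against the list is what the slide means by mentioning each license explicitly: a near-miss such as GPL-3.0-only is reported even though GPL-2.0-only is allowed.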
  • 20. Benefits: time; frequent checks; faster audit; faster release; fewer changes; fewer errors.
  • 21. Sample projects and pipelines. GitLab: https://gitlab.com/GMishx/fossology/-/merge_requests/2/pipelines and https://gitlab.com/GMishx/fossology/-/merge_requests/3/pipelines. Travis: https://github.com/GMishx/fossology/pulls, https://travis-ci.com/github/GMishx/fossology/builds/173617637 and https://travis-ci.com/github/GMishx/fossology/builds/173617688. Pull request: https://github.com/fossology/fossology/pull/1736
  • 22. What is Google Summer of Code? GSoC is an international annual program in which Google awards stipends to students who successfully complete a software coding project for an open source organization during the summer. Disclaimer: All third party logos and icons referenced by this slide are the property of their respective owners. They are just used to highlight the UI.
  • 23. GSoC timeline. Preparations (gathering ideas): involvement of the community; using GitHub issues. Finalizing (idea selection for GSoC): filtering ideas; moving to the wiki; labelling issues. Application (applying for GSoC): preparation of the application; new channel in Slack. Proposals (collaboration on selection): gathering proposals; Slack, email, GitHub issues.
  • 24. Background: Google Summer of Code. For students: experience in writing code; collaborating in an OSS project; working in a distributed environment; internship experience; internship stipend by Google. For mentors: positive visibility; meeting new students; extending the OSS community; experiencing distributed collaboration.
  • 25. GSoC 2018 & 2019: Aman Jain (Atarashi); Vivek (Spasht); Sandeep (Software Heritage Agent); Ayush (Atarashi Agent). 18 … 19 …
  • 26. GSoC 2020: Ayush (Code Comment); Kaushlendra (Atarashi enhancement); Darshan (Dashboard). Reporting: project goals; weekly progress report; milestone achieved; first, second and third evaluation. 20 …
  • 27. Post GSoC: student involvement in coding; bi-weekly meetings for interested students to discuss the community's progress. How students are helping us after GSoC: acting as messengers of FOSSology; mentoring interested students; continuous collaboration.
  • 28. Atarashi: a step towards a non-rule-based standalone command line scanner (https://github.com/fossology/atarashi). Different methods for scanning license statements: unlike rule-based approaches such as Nomos, Atarashi implements multiple text statistics and information retrieval algorithms. Distance finding algorithms: word frequency similarity; term frequency-inverse document frequency (tf-idf); Damerau–Levenshtein distance; n-grams. Similarity finding algorithms: score similarity; cosine similarity; Dice similarity; bi-gram cosine similarity.
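Two of the similarity measures named on the Atarashi slide, cosine similarity and Dice similarity, can be sketched in a few lines of Python. This is a generic textbook formulation, not Atarashi's own code: the helpers `tokenize`, `bigrams`, `cosine_similarity` and `dice_similarity` and the short reference snippets are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bigrams(tokens: List[str]) -> List[str]:
    """Join each pair of neighbouring tokens into one bi-gram string."""
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def cosine_similarity(a: List[str], b: List[str]) -> float:
    """Cosine of the angle between the term-frequency vectors of a and b."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[term] * vb[term] for term in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def dice_similarity(a: List[str], b: List[str]) -> float:
    """Dice coefficient on the sets of distinct terms: 2|A∩B| / (|A|+|B|)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))


# Hypothetical, shortened reference snippets (not full license texts).
references = {
    "MIT-like": "Permission is hereby granted, free of charge, to any person obtaining a copy",
    "GPL-like": "This program is free software; you can redistribute it and/or modify it",
}
snippet = tokenize("Permission is hereby granted, free of charge, to any person")

# Rank the references by bi-gram cosine similarity to the scanned snippet.
scores = {
    name: cosine_similarity(bigrams(snippet), bigrams(tokenize(text)))
    for name, text in references.items()
}
best = max(scores, key=scores.get)
print(best)
```

Using bi-grams instead of single words makes word order matter, which helps separate license texts that share much of their vocabulary.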
  • 29. Atarashi workflow: (1) process the input file (extract comments, normalize text); (2) match SPDX headers and SPDX identifiers; (3) apply distance finding algorithms; (4) rank results based on similarity; (5) generate the output.
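The first workflow step, extracting comments and normalizing the text, might look roughly like the Python sketch below. It is a deliberately simplified assumption: the `COMMENT` pattern only knows C-style and hash comments and does not handle string literals, and `extract_comments` and `normalize` are hypothetical helpers, not Atarashi's actual functions.

```python
import re

# Simplified comment pattern (an assumption for illustration): C-style
# block comments, C++-style line comments and hash line comments.
COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*|#[^\n]*", re.DOTALL)


def extract_comments(source: str) -> str:
    """Return all comment text found in the source, one comment per line."""
    return "\n".join(COMMENT.findall(source))


def normalize(text: str) -> str:
    """Lowercase the text and collapse everything non-alphanumeric to spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


source = "int x = 1; // Licensed under MIT\n/* Copyright\n * (c) Example */\n"
print(normalize(extract_comments(source)))
# licensed under mit copyright c example
```

The normalized string is the kind of input the later matching and ranking steps would compare against reference license texts.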
  • 30. Making conclusions easier: integration with ClearlyDefined (Spasht) and Software Heritage.
  • 31. Dashboard: FOSS-dash extracts meaningful data from the FOSSology database (fossology_DB) and exports those metrics to the InfluxDB time series database. Grafana is then used to query those metrics and visualize them as charts and graphs in the dashboard.
  • 32. Thank you for your attention! © 2016-2020 Siemens AG, The Linux Foundation, CC-BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/). Internet: https://www.fossology.org; GitHub: https://github.com/fossology/fossology. Further links: https://www.spdx.org, https://www.openchainproject.org, https://github.com/eclipse/sw360. Contact: FOSSology mailing list (fossology@fossology.org) or email us (shaheem.azmal@siemens.com, mishra.gaurav@siemens.com).
