On the provenance of Free and Open
Source Software and the legal
implications of its reuse
based on A Method for Open Source License Compliance of
Java Applications, IEEE Software
May-June 2012 (vol. 29 no. 3)
Daniel M German
Department of Computer Science
University of Victoria
IP is an engineering problem too
● Sure, Intellectual Property is the realm of lawyers
● But software engineers have to fix it.
● Open Source
– software licensed under an open source license
Open source license
– allows the creation of derivative works
– and their redistribution
As long as some conditions are satisfied
Reuse and Open Source
● FOSS materialized Component-Off-The-Shelf
– Huge pool of components ready to be used
– Free but with a price:
● Comply with the license
FOSS is everywhere today
● Used by both organizations and individuals
– Part of many commercial products
● OS X, Android, many embedded devices
● Created by many commercial companies
– Apple, Google, HP, eBay, Amazon,
Samsung, IBM, TI, Oracle, etc.
“The way software is built is changing”
Previous Senior Legal Counsel, HP
Software architectures are complex
● Operating systems
Each comes with its own license
Reuse is Easy
● Re-using FOSS is very easy
– Black box:
● reuse as a component
– White box:
● Clone: copy an entire product into one's own code base
● Cut-and-paste: copy snippets
● Most developers don't have training in
● Many think they do but don't
● Most organizations lack policies regarding
use of FOSS
* Sojer and Henkel 2010
Open Source License Compliance
● It is in need of tool support
– Mostly provided by (expensive) organizations
● Black Duck, Palamida, OpenLogic
● Treat everything as Trade Secret
● License Compliance can't trust
– Don't know, forget, ignore, lie ...
The big questions
● Who are you and where
did you come from?
– Provenance discovery
● What role do you play?
– Architectural discovery
● Does your mother know you
– License discovery
Provenance is Complicated
● Was this source file:
– Locally developed?
● If copied:
– What is the source?
● Can we trust the source?
● Measure certain properties of a software system
– Use these properties to create classifications and reduce
– Bertillonage for Java
– Based on Class and Method signatures
– Capable of matching binaries and source
– Open Source (GPLv2+)
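The signature-based idea above can be illustrated with a minimal sketch. It is not the implementation of the tool described in these slides: the regular expression, the function names (`signatures`, `similarity`, `best_matches`), the use of Jaccard similarity, and the 0.5 threshold are all assumptions chosen for illustration. A real tool would parse source or bytecode rather than use a regular expression.

```python
import re

# Hypothetical, simplified extractor: finds Java method declarations such as
# "public int size()". Constructors and generic parameters with commas are
# not handled; this is only an illustration.
METHOD_RE = re.compile(
    r"(?:public|protected|private)\s+(?:static\s+)?([\w<>\[\]]+)\s+(\w+)\s*\(([^)]*)\)"
)


def signatures(java_source: str) -> frozenset:
    """Return the set of normalized method signatures found in the source."""
    found = set()
    for return_type, name, params in METHOD_RE.findall(java_source):
        types = []
        for param in params.split(","):
            parts = param.split()
            if parts:
                # Keep the type, drop the parameter name, so that renamed
                # parameters still produce the same signature.
                types.append(parts[-2] if len(parts) >= 2 else parts[0])
        found.add(f"{return_type} {name}({','.join(types)})")
    return frozenset(found)


def similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity between two signature sets (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_matches(subject: frozenset, corpus: dict, threshold: float = 0.5) -> list:
    """Rank corpus entries by similarity, dropping those below the threshold."""
    scored = [(similarity(subject, sigs), name) for name, sigs in corpus.items()]
    return sorted([s for s in scored if s[0] >= threshold], reverse=True)
```

Because only signatures are compared, the same measurement could in principle be taken from a compiled class and from its source, which is what makes matching binaries against source plausible in this kind of approach.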
The general problem is harder
● Once you know the original code
– What is its license?
● Identify license from source code
● Open source (AGPLv3+)
● Design goals:
– To sacrifice recall for the sake of accuracy
● Rather be safe than wrong
● Support “I don't know”
– To be faster than FOSSology
– To support the most common licenses, yet be extensible
– To have a very simple “pipe” architecture
● Collection of small tools
● The output of one feeds into the other
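A "pipe" architecture with an explicit "I don't know" answer might look like the sketch below. The stage names, the two sentence patterns, and the license labels are hypothetical and far smaller than any real rule set; this is not the actual tool's code or its list of stages.

```python
import re
from functools import reduce

# Hypothetical sentence patterns, written against lowercased text.
KNOWN_SENTENCES = {
    "GPL-2.0-or-later": re.compile(
        r"either version 2 of the license, or \(at your option\) any later version"
    ),
    "Apache-2.0": re.compile(r"licensed under the apache license, version 2\.0"),
}


def extract_comments(source: str) -> str:
    """Stage 1: keep only the text inside /* ... */ comments."""
    return " ".join(re.findall(r"/\*(.*?)\*/", source, flags=re.DOTALL))


def normalize(text: str) -> str:
    """Stage 2: drop comment decoration, collapse whitespace, lowercase."""
    text = re.sub(r"[*/]+", " ", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def match_licenses(text: str) -> list:
    """Stage 3: report every known license whose sentence appears."""
    return sorted(name for name, pat in KNOWN_SENTENCES.items() if pat.search(text))


def decide(matches: list) -> str:
    """Stage 4: say UNKNOWN rather than guess when nothing matched."""
    return ",".join(matches) if matches else "UNKNOWN"


# Each small tool takes the previous tool's output as its input.
PIPELINE = [extract_comments, normalize, match_licenses, decide]


def identify_license(source: str) -> str:
    """Feed the output of each stage into the next one."""
    return reduce(lambda data, stage: stage(data), PIPELINE, source)
```

Returning "UNKNOWN" instead of the closest guess is one way to trade recall for accuracy: files with unrecognized headers are left for a person to inspect.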
Component level composition
● Requires architectural analysis
● How are components connected?
– Type of connection?
● Linking? Dynamic? Static?
● Fork/System exec?
● Web service?
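The questions above suggest recording, for each pair of components, how they are connected. The sketch below is one hypothetical way to hold that information; the class names, component names, and license labels are invented for illustration. It makes no legal judgment: it only lists connections that cross a license boundary so a reviewer can look at them.

```python
from dataclasses import dataclass
from enum import Enum


class Connection(Enum):
    STATIC_LINK = "static link"
    DYNAMIC_LINK = "dynamic link"
    FORK_EXEC = "fork/system exec"
    WEB_SERVICE = "web service"


@dataclass(frozen=True)
class Component:
    name: str
    license: str


@dataclass(frozen=True)
class Dependency:
    user: Component       # the component that depends on another
    used: Component       # the component being reused
    connection: Connection


def review_items(dependencies):
    """List cross-license connections so a reviewer can assess each one."""
    return [
        f"{d.user.name} ({d.user.license}) -> {d.used.name} "
        f"({d.used.license}) via {d.connection.value}"
        for d in dependencies
        if d.user.license != d.used.license
    ]
```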
● FOSS reuse is here to stay
● Organizations should be careful on how they
– FOSS License Compliance
● Software is needed to help
● We have implemented a method to help in
license compliance of Java Applications
– Joa: provenance
– Ninka: licensing