Code Provenance Review
The following code provenance review steps are proposed only; they are not approved or authoritative in any way.
Goal
To establish a reasonable comfort level that projects going through incubation do not have improperly contributed code, and that the code is all under the project license.
A code provenance review is desirable because it reduces the risk of the foundation, project developers or software users becoming involved in a legal action or having their use of the software disrupted by sudden removal of improperly contributed code. In particular, many enterprises will not build on open source software projects without some degree of assurance that care is being taken to avoid improper contributions.
It is not the goal to be able to prove that every source file, and every contribution to those files, was contributed properly. The onus is not on us to prove there are no problems. However, we want to ensure we do not release code with provenance issues that we could have identified and corrected with a reasonable effort.
Library/Component Review
- Prepare a list of any external components that are included "in the source tree" for the project. For instance, GDAL includes a copy of libtiff, libjpeg, etc. in the source tree. It is necessary to be able to identify things like that in the source tree that are under their own distinct license and are not explicitly vetted by the project team. If practical, it is desirable to remove these internal components and treat them as external dependencies. If kept internal, notes on the licenses of the components should be included in the provenance review document.
- Prepare a list of external dependencies with potentially problematic license terms. This includes all non-free libraries, for instance, and all libraries with licenses that might conflict with the project license (for instance, GDAL's use of GPL'ed GRASS libraries in the non-GPL GDAL).
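As an illustration only, the two lists above could be kept as a small structured inventory. The sketch below is hypothetical: the `Component` type, the `flag_for_review` helper, the license labels, and the `compatible` set are all assumptions made for the example, and deciding which licenses are actually compatible remains a judgment for the project, not something the code determines.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    license: str   # short label for the component's license (illustrative)
    bundled: bool  # True if a copy lives inside the project's source tree


def flag_for_review(components, project_license, compatible):
    """Split an inventory into the two lists the review asks for.

    `compatible` is the set of license labels the project has already
    judged acceptable alongside `project_license` (an assumption here).
    """
    # Components copied into the source tree: their licenses need notes.
    bundled = [c for c in components if c.bundled]
    # Anything not under the project license and not already judged
    # compatible is flagged as potentially problematic.
    problematic = [
        c for c in components
        if c.license != project_license and c.license not in compatible
    ]
    return bundled, problematic


# Hypothetical inventory loosely based on the GDAL examples in the text.
components = [
    Component("libtiff", "libtiff", bundled=True),
    Component("libjpeg", "IJG", bundled=True),
    Component("GRASS", "GPL", bundled=False),
]
bundled, problematic = flag_for_review(
    components, project_license="MIT/X", compatible={"libtiff", "IJG"}
)
print([c.name for c in bundled])
print([c.name for c in problematic])
```

Both resulting lists would then be copied into the provenance review document, with notes on each entry.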
Code Copyright Review
The objective here is to visit every source file, identify possible issues, and work to "regularize" things.
- Does the file include the license information? If not, add it if there is no ambiguity about whether the standard project license applies. If that is not obvious, make notes in the review document.
- Is the file under the normal project license? If not, make notes in the review document.
- Is there anything obviously unusual about the origin of the code? Does this pose any conflicts? Is the issue properly described in the source file? For instance, in GDAL, the gdal/port/cpl_strtod.cpp file is closely derived from external code that was placed in the public domain. cpl_strtod.cpp is placed under the normal GDAL MIT/X license, but detailed notes are kept in the header text on its origin, the fact that the original was public domain, and therefore that we are free to relicense it. Oddities should be noted in the source file itself and in the review document.
- Maintain a list of all copyright holders identified in the review document. This list is essentially everyone who would need to agree to relicense the project. It may be desirable to seek copyright assignment to a "project lead" or to the foundation to reduce the number of copyright holders for the project, though this is not required.
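A first pass over the files can be partly automated before the manual review. The following is a minimal sketch, not a prescribed tool: the `review_tree` function, the suffix list, the copyright-line pattern, and the idea of detecting the license by a fixed marker string are all assumptions for illustration. It only reports files whose header lacks the marker and collects names from lines shaped like `Copyright (c) 1998, Some Name`; unusual origins still need a human reader.

```python
import re
from pathlib import Path

# Assumed header shape: "Copyright (c) 1998, Some Name" or
# "Copyright (c) 1998-2004 Some Name". Other shapes will be missed.
COPYRIGHT_RE = re.compile(
    r"Copyright \(c\) (\d{4}(?:-\d{4})?),?[ \t]+(.+)", re.IGNORECASE
)


def review_tree(root, license_marker, suffixes=(".c", ".cpp", ".h", ".py")):
    """Return (files missing the license marker, sorted copyright holders)."""
    missing = []
    holders = set()
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # Only the start of the file is inspected; headers live at the top.
        head = path.read_text(encoding="utf-8", errors="replace")[:4000]
        if license_marker not in head:
            missing.append(path.relative_to(root).as_posix())
        for match in COPYRIGHT_RE.finditer(head):
            holders.add(match.group(2).strip())
    return missing, sorted(holders)
```

For an MIT/X-licensed project, a call such as `review_tree("gdal", "Permission is hereby granted, free of charge")` would be one plausible way to use it; the marker string is simply the opening phrase of the MIT license text. Files in the `missing` list are candidates for a note in the review document, not automatic fixes.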
Review Document
The result of the provenance review is twofold. The first is the set of clarifications and "fixes" made during the review, for instance adding missing copyright notices or factoring out external libraries. The second is a review report with a fairly detailed list of outstanding issues, ambiguities, and information of note.
The review document will be distributed to the project PSC members, as well as the incubation committee. Based on it, the incubation committee may require the project to do additional work, either resolving ambiguities, factoring items out, or rewriting questionable components.
When completed, a much briefer form of the review document should be prepared, listing only information that would be pertinent to people using the project. This is essentially a summary, and it might live in source control as README.LICENSE or something similar.