Compliance automation with Yocto

As in the previous post, we’re looking at how to automate license compliance with Yocto. Last time we looked at squeezing information (not only data) out of Yocto, specifically at how to get dependency information for a package. We tried three different approaches and found that two of them might be useful.

This time we’re following up on those two approaches and will add license information to the JSON files (libcairo.json and cairo__cairo.json).

Getting the license information

I’ve had many discussions with colleagues about where to get the license information. I can’t mention them by name without asking first, but I would like to say a big “Thanks for all the help! Really appreciated.” As far as I know, there are a couple of places where you can find the license information in Yocto.

Bitbake recipe

Let’s continue with Cairo and have a look at the recipe for that. The recipe is located here: ../meta/recipes-graphics/cairo/cairo_1.16.0.bb (relative path from the build root) and at Yocto’s git repository: cairo_1.16.0.bb

So let’s have a look at some of the gory details:

LICENSE = "(MPL-1.1 | LGPLv2.1) & GPLv3+"
LICENSE_${PN} = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-dev = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-doc = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-gobject = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-script-interpreter = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-perf-utils = "GPLv3+"

At first sight this may look like garbage, but it certainly is not. Yocto has a nice way of making sure that the sub packages (cairo-gobject, cairo-doc…) have their own license information rather than one license for the entire package. This is (I guess) used when Yocto builds. But using this data in reverse, that is going from a built binary (or something else) back to the recipe, seems a bit cumbersome. Or does it?

Let’s say we need to find the license for a library libfreetype.so.6. To find the license through the recipe we need to know:

  • which package it belongs to

  • which sub package of that package

We can do find and grep on packages-split.

$ find tmp/work/core2-64-poky-linux/ -name "libfreetype.so.6" | grep packages-split
tmp/work/core2-64-poky-linux/freetype/2.10.1-r0/packages-split/freetype/usr/lib/libfreetype.so.6

From this we can see that the package is (surprise!) freetype and that the sub package is (also called) freetype. So we need to find a recipe called freetype*.bb.
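The package and sub package can also be read straight off the path. Here is a minimal sketch, assuming the directory layout seen in the find output above (tmp/work/<arch>/<recipe>/<version>/packages-split/<sub package>/...); the function name split_package_path is made up for this illustration.

```python
from pathlib import PurePosixPath


def split_package_path(path):
    """Return (recipe, version, sub_package) for a path below packages-split.

    Assumes the layout:
    tmp/work/<arch>/<recipe>/<version>/packages-split/<sub package>/...
    """
    parts = PurePosixPath(path).parts
    idx = parts.index("packages-split")  # raises ValueError if not present
    if idx < 2 or idx + 1 >= len(parts):
        raise ValueError(f"unexpected layout: {path}")
    return parts[idx - 2], parts[idx - 1], parts[idx + 1]


example = ("tmp/work/core2-64-poky-linux/freetype/2.10.1-r0/"
           "packages-split/freetype/usr/lib/libfreetype.so.6")
print(split_package_path(example))
```

For the freetype path this yields the recipe name, the version directory and the sub package, which is what the recipe lookup needs.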

$ find ../meta* -name "freetype*.bb"
../meta/recipes-graphics/freetype/freetype_2.10.1.bb

In this file we can check the license information:

$ grep LICENSE ../meta/recipes-graphics/freetype/freetype_2.10.1.bb
LICENSE = "FreeType | GPLv2+"
LIC_FILES_CHKSUM = "file://docs/LICENSE.TXT;md5=4af6221506f202774ef74f64932878a1 \

For cairo the corresponding file looks a bit different:

$ grep LICENSE ../meta/recipes-graphics/cairo/cairo_1.16.0.bb 
LICENSE = "(MPL-1.1 | LGPLv2.1) & GPLv3+"
LICENSE_${PN} = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-dev = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-doc = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-gobject = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-script-interpreter = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-perf-utils = "GPLv3+"

So for Cairo the license information seems to be different for the various sub packages.

But we can see that it is possible to find the license expression, either directly via the package name or indirectly via package name and sub package name. Some recipes might include other recipes, which makes this approach a bit tricky, yet doable.
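The lookup via package name and sub package name could be sketched roughly like this. It only handles the two forms shown above (LICENSE = "..." and LICENSE_${PN}... = "..."), does not follow included recipes, and assumes that a sub package without its own entry falls back to the recipe-wide LICENSE. The names recipe_licenses and license_for are hypothetical.

```python
import re

# Matches: LICENSE = "..."  or  LICENSE_${PN}-doc = "..."
LICENSE_LINE = re.compile(
    r'^LICENSE(?:_(?P<pkg>\$\{PN\}\S*))?\s*=\s*"(?P<expr>[^"]*)"'
)


def recipe_licenses(recipe_text, pn):
    """Map sub package name (None for the recipe-wide LICENSE) to its expression."""
    licenses = {}
    for line in recipe_text.splitlines():
        match = LICENSE_LINE.match(line.strip())
        if match:
            pkg = match.group("pkg")
            if pkg is not None:
                pkg = pkg.replace("${PN}", pn)  # expand the package name
            licenses[pkg] = match.group("expr")
    return licenses


def license_for(licenses, sub_package):
    """Prefer the sub package entry; fall back to the recipe-wide LICENSE (assumption)."""
    return licenses.get(sub_package, licenses.get(None))


recipe = '''
LICENSE = "(MPL-1.1 | LGPLv2.1) & GPLv3+"
LICENSE_${PN} = "MPL-1.1 | LGPLv2.1"
LICENSE_${PN}-perf-utils = "GPLv3+"
'''
licenses = recipe_licenses(recipe, "cairo")
print(license_for(licenses, "cairo"))
print(license_for(licenses, "cairo-perf-utils"))
```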

license.manifest

When building, Yocto produces a file called license.manifest. The last time I built from scratch was 2020-10-24, and my manifest file is located here (relative path from the build dir): ./tmp/deploy/licenses/core-image-minimal-qemux86-64-20201024110850/license.manifest.

Let’s grep a bit in this file for information about only the library that resides in the cairo/cairo package. And by the way, let’s add some context (three extra lines):

$ grep -A 3 "PACKAGE NAME: cairo$" ./tmp/deploy/licenses/core-image-minimal-qemux86-64-20201024110850/license.manifest
PACKAGE NAME: cairo
PACKAGE VERSION: 1.16.0
RECIPE NAME: cairo
LICENSE: MPL-1.1 | LGPLv2.1

We can extract the last and only relevant (in this use case) line and from that line extract the license expression:

$  grep -A 3 "PACKAGE NAME: cairo$" ./tmp/deploy/licenses/core-image-minimal-qemux86-64-20201024110850/license.manifest | tail -1 | cut -d ":" -f 2
 MPL-1.1 | LGPLv2.1

The licenses are not specified using SPDX, but at least we have the license information. I am currently writing a small tool to translate from Yocto’s license expressions to SPDX. But before I release that I want to make sure there is no other similar tool, which I bet there is.
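To give an idea of what such a translation involves, here is a rough sketch. The mapping table is purely illustrative and my own assumption (for instance, reading LGPLv2.1 as LGPL-2.1-only is an interpretation); it is not Yocto’s own table, and the function name to_spdx is made up.

```python
import re

# Illustrative mapping only: these pairs are assumptions, not Yocto's own table.
YOCTO_TO_SPDX = {
    "MPL-1.1": "MPL-1.1",
    "LGPLv2.1": "LGPL-2.1-only",
    "GPLv2+": "GPL-2.0-or-later",
    "GPLv3+": "GPL-3.0-or-later",
    "FreeType": "FTL",
}
OPERATORS = {"|": "OR", "&": "AND"}


def to_spdx(expression):
    """Translate a Yocto license expression token by token."""
    # Tokens are parentheses, the operators | and &, or license names.
    tokens = re.findall(r"[()|&]|[^\s()|&]+", expression)
    out = []
    for token in tokens:
        if token in ("(", ")"):
            out.append(token)
        elif token in OPERATORS:
            out.append(OPERATORS[token])
        else:
            out.append(YOCTO_TO_SPDX[token])  # KeyError for unknown names
    return " ".join(out).replace("( ", "(").replace(" )", ")")


print(to_spdx("MPL-1.1 | LGPLv2.1"))
print(to_spdx("(MPL-1.1 | LGPLv2.1) & GPLv3+"))
```

Failing loudly on unknown license names seems safer than guessing in a compliance context.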

Anyhow, it is possible, and without much effort, to find the license information in the license.manifest file. How quick is it? That question is answered by looking at:

  1. how quickly can we find the license file?

  2. how quickly can we find the license information in that file?

Both of them should be quick enough. And since it is only one file we could probably read it once and cache it. This touches on a failure I’ve seen in programming education: when using Maps, the students get confronted with the theory but not with a practical use case. I’ve seen teachers who spend time discussing Maps and Heaps without knowing how to use them. Ahhh, I must not get absorbed by this. Let’s leave education… after all I quit my job as a university teacher.
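Reading the file once and caching it in a map could look something like the sketch below. It assumes the manifest consists of "KEY: value" lines where each PACKAGE NAME line starts a new entry, as in the grep output above; parse_manifest is a made-up name.

```python
def parse_manifest(text):
    """Return {package name: {field: value}} from license.manifest content."""
    packages = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank or unexpected line
        key, value = key.strip(), value.strip()
        if key == "PACKAGE NAME":
            current = packages.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return packages


sample = """\
PACKAGE NAME: cairo
PACKAGE VERSION: 1.16.0
RECIPE NAME: cairo
LICENSE: MPL-1.1 | LGPLv2.1
"""
manifest = parse_manifest(sample)
print(manifest["cairo"]["LICENSE"])
```

In a real run the text would come from reading license.manifest once; after that every license lookup is a plain dictionary access.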

But there’s a bit of a problem. Let’s see what sub packages cairo has.

$ ls -1 tmp/work/core2-64-poky-linux/cairo/1.16.0-r0/packages-split/ | grep -v -e dev -e dbg -e "\-lic" -e shlibdeps -e src
cairo
cairo-doc
cairo-gobject
cairo-locale
cairo-perf-utils
cairo-script-interpreter

Can we find an entry in the license.manifest for each of them?

$ for pkg in $(ls -1 tmp/work/core2-64-poky-linux/cairo/1.16.0-r0/packages-split/ | grep -v -e dev -e dbg -e "\-lic" -e shlibdeps -e src); do echo -n "$pkg: "; grep "PACKAGE NAME: $pkg[ ]*$" tmp/deploy/licenses/core-image-minimal-qemux86-64-20201024110850/license.manifest|wc -l ; done
cairo: 1
cairo-doc: 0
cairo-gobject: 1
cairo-locale: 0
cairo-perf-utils: 0
cairo-script-interpreter: 0
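The same check can be sketched as a small filter plus a set lookup. The excluded fragments mirror the grep -v patterns above; the three filtered-out names in the example input are assumed examples, and both function names are hypothetical.

```python
# Fragments excluded by the grep -v call: dev, dbg, -lic, shlibdeps, src
EXCLUDED = ("dev", "dbg", "-lic", "shlibdeps", "src")


def runtime_sub_packages(names):
    """Drop names containing any excluded fragment, like grep -v -e ... does."""
    return sorted(n for n in names if not any(part in n for part in EXCLUDED))


def missing_from_manifest(sub_packages, manifest_names):
    """Return the sub packages that have no entry in the manifest."""
    known = set(manifest_names)
    return [name for name in sub_packages if name not in known]


sub_packages = runtime_sub_packages([
    "cairo", "cairo-doc", "cairo-gobject", "cairo-locale",
    "cairo-perf-utils", "cairo-script-interpreter",
    "cairo-dev", "cairo-dbg", "cairo-lic",  # assumed examples of filtered names
])
print(missing_from_manifest(sub_packages, ["cairo", "cairo-gobject"]))
```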

There’s no license information for, among others, cairo-script-interpreter. Does that sub package put something in the system image?

$ tree --charset=ascii  tmp/work/core2-64-poky-linux/cairo/1.16.0-r0/packages-split/cairo-script-interpreter
tmp/work/core2-64-poky-linux/cairo/1.16.0-r0/packages-split/cairo-script-interpreter
`-- usr
    `-- lib
        |-- libcairo-script-interpreter.so.2 -> libcairo-script-interpreter.so.2.11600.0
        `-- libcairo-script-interpreter.so.2.11600.0

2 directories, 2 files

Uh oh. But fear not. This is not put in the image.

Using package-lic

Yocto produces a directory called package-lic, e.g. cairo-lic. In my build I find it like this:

$ find . -name "cairo-lic" -type d | grep packages-split
./tmp/work/core2-64-poky-linux/cairo/1.16.0-r0/packages-split/cairo-lic

Note: grepping for packages-split is important. Otherwise we’ll find tons of links and files with the same license information. This way we only get one hit: the canonical (in the cairo path) information.

In this directory (./tmp/work/core2-64-poky-linux/cairo/1.16.0-r0/packages-split/cairo-lic) we will, somewhere, find a couple of files.

$ find ./tmp/work/core2-64-poky-linux/cairo/1.16.0-r0/packages-split/cairo-lic -type f -exec basename {} \;
generic_MPL-1.1
COPYING
generic_LGPLv2.1
generic_GPLv3

Let’s go through them.

  • COPYING - the same file as the COPYING file in the cairo git repo (https://cgit.freedesktop.org/cairo/tree/COPYING). This is human-readable text and a bit hard for a program to analyse.

  • generic_... - three files ending with a license name (again, not in SPDX format).
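A program can at least pull the license names out of the generic_ file names. Here is a minimal sketch, assuming the generic_<license name> naming seen in the listing above (the function name is made up). Note that the file names carry no operators, so they cannot tell us how the licenses combine.

```python
def generic_license_names(file_names):
    """Return the license names encoded in generic_<name> file names."""
    prefix = "generic_"
    return sorted(n[len(prefix):] for n in file_names if n.startswith(prefix))


files = ["generic_MPL-1.1", "COPYING", "generic_LGPLv2.1", "generic_GPLv3"]
print(generic_license_names(files))
```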

So according to COPYING, cairo is released under LGPL-2.1 OR MPL-1.1. But according to the generic license files, cairo is released under LGPL-2.1 OR MPL-1.1 AND/OR GPLv3. This is confusing to me:

  • is it AND or OR GPLv3?

  • is it GPLv3 at all?

….. something Wicked This Way Comes :(

Have we stumbled upon a so-called envelope license problem, where the project’s license file (in this case COPYING) states one thing and some other files state something else? Or have we? Look again.

Now we’re looking at the entire package cairo and not at libcairo (or cairo/cairo), so this information is most likely correct for the package as a whole but not for the part libcairo (or cairo/cairo).

Using license-destdir/***/recipeinfo

Another file Yocto produces is recipeinfo, which can be found under the package’s license-destdir. Let’s find the one for cairo and have a look at its content:

$ find ./tmp/work/core2-64-poky-linux/cairo/ -type f -name "recipeinfo"
./tmp/work/core2-64-poky-linux/cairo/1.16.0-r0/license-destdir/cairo/recipeinfo
$ cat ./tmp/work/core2-64-poky-linux/cairo/1.16.0-r0/license-destdir/cairo/recipeinfo
LICENSE: (MPL-1.1 | LGPLv2.1) & GPLv3+
PR: r0
PV: 1.16.0

Let’s extract the license:

$ grep "LICENSE:" ./tmp/work/core2-64-poky-linux/cairo/1.16.0-r0/license-destdir/cairo/recipeinfo | cut -d ":" -f 2
 (MPL-1.1 | LGPLv2.1) & GPLv3+

So from this file we get that the license of cairo is: (MPL-1.1 | LGPLv2.1) & GPLv3+. Uh oh… you might think. But look again. Same as before, this license expression applies to the entire package and not to libcairo (cairo/cairo).
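If a program should read recipeinfo, splitting each line at the first colon avoids the leading space that cut leaves behind. This is a small sketch, assuming the file only contains "KEY: value" lines like the three shown above; parse_recipeinfo is a hypothetical name.

```python
def parse_recipeinfo(text):
    """Return {key: value} for the 'KEY: value' lines of a recipeinfo file."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep:
            info[key.strip()] = value.strip()
    return info


recipeinfo = """\
LICENSE: (MPL-1.1 | LGPLv2.1) & GPLv3+
PR: r0
PV: 1.16.0
"""
info = parse_recipeinfo(recipeinfo)
print(info["LICENSE"])
```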

What is the license of Cairo

This is something we will look into in the next post. For now, it suffices to say we’ve concluded that the license of Cairo/libcairo is: MPL-1.1 | LGPLv2.1.

Summarising

With the approaches we’ve found:

Approach          License expressions               Accuracy             Concerns
----------------  --------------------------------  -------------------  ---------------------
Bitbake recipe    MPL-1.1 | LGPLv2.1                applies to libcairo  cumbersome, but works
license.manifest  MPL-1.1 | LGPLv2.1                applies to libcairo  works
package-lic       LGPL-2.1 OR MPL-1.1 AND/OR GPLv3  applies to cairo     need to find file
recipeinfo        (MPL-1.1 | LGPLv2.1) & GPLv3+     applies to cairo     need to find file

So from trying out these approaches we can see that license.manifest contains the licenses in an easily read form, while the recipe contains the accurate information but is a bit harder to read.

Note: I believe I’ve read somewhere that Yocto is able to produce license text in SPDX format. Let’s look into that some other day.

About the cover image

Software license from flickr, (c) 2020 Henrik Sandklef released under Attribution-ShareAlike 2.0 Generic (CC BY-SA 2.0)