Post-Scan Options

Post-Scan options activate their respective post-scan plugins which execute the task.

All “Post-Scan” Options

--mark-source

Set the “is_source” flag to true for directories that contain over 90% of source files as direct children and descendants. Count the number of source files in a directory as a new “source_file_counts” attribute

Sub-Option of - --url

--consolidate

Group resources by Packages or license and copyright holder and return those groupings as a list of consolidated packages and a list of consolidated components.

Sub-Option of - --copyright, --license and --packages.

--filter-clues

Filter redundant duplicated clues already contained in detected licenses, copyright texts and notices.

--is-license-text

Set the “is_license_text” flag to true for files that contain mostly license texts and notices (e.g. over 90% of the content).

Sub-Option of - --info and --license-text.

Warning

--is-license-text is an experimental Option.

--license-clarity-score

Compute a summary license clarity score at the codebase level.

Sub-Option of - --classify.

--license-policy FILE

Load a License Policy file and apply it to the scan at the Resource level.

--summary

Summarize license, copyright and other scans at the codebase level.

Sub-Options:

  • --summary-by-facet

  • --summary-key-files

  • --summary-with-details

--summary-by-facet

Summarize license, copyright and other scans and group the results by facet.

Sub-Option of - --summary and --facet.

--summary-key-files

Summarize license, copyright and other scans for key, top-level files. Key files are top- level codebase files such as COPYING, README and package manifests as reported by the --classify option “is_legal”, “is_readme”, “is_manifest” and “is_top_level” flags.

Sub-Option of - --classify and --summary.

--summary-with-details

Summarize license, copyright and other scans at the codebase level, keeping intermediate details at the file and directory level.


Note

Please note that different for different install methods, scan commands vary.

The basic command to perform a scan, in case of a download and configure installation (on Linux/MacOS) is:

path/to/scancode [OPTIONS] <OUTPUT FORMAT OPTION(s)> <SCAN INPUT>

The basic usage, if Scancode is installed from pip, or in Windows:

scancode [OPTIONS] <OUTPUT FORMAT OPTION(s)> <SCAN INPUT>

For more information on how the Scan Command varies for Various Installation Methods/Operating Systems, refer Commands Variation.

To see all plugins available via command line help, use --plugins.

Note

Plugins that are shown by using --plugins inlcude the following:

  1. Post-Scan Plugins (and, the following)

  2. Pre-Scan Plugins

  3. Output Options

  4. Output Control

  5. Basic Scan Options


--mark-source Option

Dependency

The option --mark-source is a sub-option of and requires the option --info.

The mark-source option marks the “is_source” attribute of a directory to be “True”, if more than 90% of the files under that directory is source files, i.e. Their “is_source” attribute is “True”.

When the following command is executed to scan the samples directory with this option enabled:

scancode -clpieu --json-pp output.json samples --mark-source

Then, the following directories are marked as “Source”, i.e. Their “is_source” attribute is changed from “false” to “True”.

  • samples/JGroups/src

  • samples/zlib/iostream2

  • samples/zlib/gcc_gvmat64

  • samples/zlib/ada

  • samples/zlib/infback9


--consolidate Option

Dependency

The option --consolidate is a sub-option of and requires the options --license , --copyright and --package.

The JSON file containing scan results after using the --consolidate Plugin is structured as follows: (note: “…” in the image contains more data)

An example Scan:

scancode -clpieu --json-pp output.json samples --consolidate

The JSON output file is structured as follows:

{
  "headers": [
    {...}
  ],
  "consolidated_components": [
    {...
    },
    {
      "type": "license-holders",
      "identifier": "dmitriy_anisimkov_1",
      "consolidated_license_expression": "gpl-2.0-plus WITH ada-linking-exception",
      "consolidated_holders": [
        "Dmitriy Anisimkov"
      ],
      "consolidated_copyright": "Copyright (c) Dmitriy Anisimkov",
      "core_license_expression": "gpl-2.0-plus WITH ada-linking-exception",
      "core_holders": [
        "Dmitriy Anisimkov"
      ],
      "other_license_expression": null,
      "other_holders": [],
      "files_count": 1
    },
    {...
    }
  ],
  "consolidated_packages": [],
  "files": [
  ]
}

Each consolidated component has the following information:

"consolidated_components": [
{
  "type": "license-holders",
  "identifier": "dmitriy_anisimkov_1",
  "consolidated_license_expression": "gpl-2.0-plus WITH ada-linking-exception",
  "consolidated_holders": [
    "Dmitriy Anisimkov"
  ],
  "consolidated_copyright": "Copyright (c) Dmitriy Anisimkov",
  "core_license_expression": "gpl-2.0-plus WITH ada-linking-exception",
  "core_holders": [
    "Dmitriy Anisimkov"
  ],
  "other_license_expression": null,
  "other_holders": [],
  "files_count": 1
},

In addition to this, in every file/directory where the consolidated part (i.e. License information) was present, a “consolidated_to” attribute is added pointing to the “identifier” of “consolidated_components”:

"consolidated_to": [
         "dmitriy_anisimkov_1"
         ],

Note that multiple files may have the same “consolidated_to” attribute.


--filter-clues Option

The --filter-clues Plugin filters redundant duplicated clues already contained in detected licenses, copyright texts and notices.

Warning

Running the following scan generates an error:

./scancode -clp --json-pp sample_filter_clues.json samples --filter-clues

--is-license-text Option

Dependency

The option --is-license-text is a sub-option of and requires the options --info and --license-text. Also, the option --license-text is a sub-option of and requires the options --license.

If the --is-license-text is used, then the “is_license_text” flag is set to true for files that contain mostly license texts and notices. Here mostly means over 90% of the content of the file.

An example Scan:

scancode -clpieu --json-pp output.json samples --license-text --is-license-text

If the samples directory is scanned with this plugin, the files containing mostly license texts will have the following attribute set to ‘true’:

"is_license_text": true,

The files in samples that will have the “is_license_text” to be true are:

samples/JGroups/EULA
samples/JGroups/LICENSE
samples/JGroups/licenses/apache-1.1.txt
samples/JGroups/licenses/apache-2.0.txt
samples/JGroups/licenses/bouncycastle.txt
samples/JGroups/licenses/cpl-1.0.txt
samples/JGroups/licenses/lgpl.txt
samples/zlib/dotzlib/LICENSE_1_0.txt

Note that the license objects for each detected license in the files already has “is_license_text” attributes by default, but not the file objects. They only have this attribute if the plugin is used.

Warning

--is-license-text is an experimental Option.


--license-clarity-score Option

Dependency

The option --license-clarity-score is a sub-option of and requires the option --classify.

The --license-clarity-score plugin when used in a scan, computes a summary license clarity score at the codebase level.

An example Scan:

scancode -clpieu --json-pp output.json samples --classify --license-clarity-score

The “license_clarity_score” will have the following attributes:

  • “score”

  • “declared”

  • “discovered”

  • “consistency”

  • “spdx”

  • “license_texts”

It whole JSON file is structured as follows, when it has “license_clarity_score”:

{
  "headers": [
    { ...
    }
  ],
  "license_clarity_score": {
    "score": 17,
    "declared": false,
    "discovered": 0.69,
    "consistency": false,
    "spdx": false,
    "license_texts": false
  },
  "files": [
  ...
  ]
}

--license-policy FILE Option

The Policy file is a YAML (.yml) document with the following structure:

license_policies:
-   license_key: mit
    label: Approved License
    color_code: '#00800'
    icon: icon-ok-circle
-   license_key: agpl-3.0
    label: Approved License
    color_code: '#008000'
    icon: icon-ok-circle

Note

In the policy file only the “license_key” is a required field.

Applying License Policies during a ScanCode scan, using the --license-policy Plugin:

scancode -clipeu --json-pp output.json samples --license-policy policy-file.yml

Note

--license-policy FILE is a not a sub-option of --license. It works normally without -l.

This adds to every file/directory an object “license_policy”, having as further attributes under it the fields as specified in the .YAML file. Here according to our example .YAML file, the attributes will be:

  • “license_key”

  • “label”

  • “color_code”

  • “icon”

Here the samples directory is scanned, and the Scan Results for a sample file is as follows:

{
  "path": "samples/JGroups/licenses/apache-2.0.txt",
  ...
  ...
  ...
  "licenses": [
  ...
  ...
  ...
  ],
  "license_expressions": [
    "apache-2.0"
  ],
  "copyrights": [],
  "holders": [],
  "authors": [],
  "packages": [],
  "emails": [],
  "license_policy": {
    "license_key": "apache-2.0",
    "label": "Approved License",
    "color_code": "#008000",
    "icon": "icon-ok-circle"
  },
  "urls": [],
  "files_count": 0,
  "dirs_count": 0,
  "size_count": 0,
  "scan_errors": []
},

More information on the License Policy Plugin and usage.


--summary Option

Sub-Option

The option --summary-by-facet, --summary-key-files and --summary-with-details``are sub-options of ``--summary. These Sub-Options are all Post-Scan Options.

An example Scan:

scancode -clpieu --json-pp output.json samples --summary

The whole JSON file is structured as follows, when the --summary plugin is applied:

{
  "headers": [
    {
    ...
    }
  ],
  "summary": {
    "license_expressions": [ ...
    ],
    "copyrights": [ ...
    ],
    "holders": [ ...
    ],
    "authors": [ ...
    ],
    "programming_language": [ ...
    ],
    "packages": []
  },
  "files": [ ...
  ]
}

The Summary object has the following attributes.

  • “license_expressions”

  • “copyrights”

  • “holders”

  • “authors”

  • “programming_language”

  • “packages”

Each attribute has multiple entries each containing “value” and “count”, with their values having the summary information inside them.

A sample summary object generated:

"summary": {
"license_expressions": [
  {
    "value": "zlib",
    "count": 13
  },
]
],
"copyrights": [
  {
    "value": "Copyright (c) Mark Adler",
    "count": 4
  },
  {
    "value": "Copyright (c) Free Software Foundation, Inc.",
    "count": 2
  },
  {
    "value": "Copyright (c) The Apache Software Foundation",
    "count": 1
  },
  {
    "value": "Copyright Red Hat, Inc. and individual contributors",
    "count": 1
  }
],
"holders": [
  {
    "value": null,
    "count": 10
  },
  {
    "value": "Mark Adler",
    "count": 4
  },
  {
    "value": "Red Hat, Inc. and individual contributors",
    "count": 1
  },
  {
    "value": "The Apache Software Foundation",
    "count": 1
  },
],
"authors": [
  {
    "value": "Bela Ban",
    "count": 4
  },
  {
    "value": "Brian Stansberry",
    "count": 1
  },
  {
    "value": "the Apache Software Foundation (http://www.apache.org/)",
    "count": 1
  }
],
"programming_language": [
  {
    "value": "C++",
    "count": 13
  },
  {
    "value": "Java",
    "count": 7
  },
],
"packages": []

--summary-by-facet Option

Dependency

The option --summary-by-facet is a sub-option of and requires the options --facet and --summary.

Running the scan with --summary --summary-by-facet Plugins creates individual summaries for all the facets with the same license, copyright and other scan information, at a codebase level (in addition to the codebase level general summary generated by --summary Plugin)

An example scan using the --summary-by-facet Plugin:

scancode -clieu --json-pp output.json samples --summary --facet dev="*.java" --facet dev="*.c" --summary-by-facet

Note

All other files which are not dev are marked to be included in the facet core.

Warning

Running the same scan with ./scancode -clpieu i.e. with -p generates an error. Avoid this.

The JSON file containing scan results is structured as follows:

{
  "headers": [ ...
  ],
  "summary": { ...
  },
  "summary_by_facet": [
    {
      "facet": "core",
      "summary": { ...
      }
    },
    {
      "facet": "dev",
      "summary": { ...
      }
    },
    {
      "facet": "tests",
      "summary": { ...
      }
    },
    {
      "facet": "docs",
      "summary": { ...
      }
    },
    {
      "facet": "data",
      "summary": { ...
      }
    },
    {
      "facet": "examples",
      "summary": { ...
      }
    }
  ],
  "files": [
}

A sample “summary_by_facet” object generated by the previous scan (shortened):

"summary_by_facet": [
  {
    "facet": "core",
    "summary": {
      "license_expressions": [
        {
          "value": "mit",
          "count": 1
        },
      ],
      "copyrights": [
        {
          "value": "Copyright (c) Free Software Foundation, Inc.",
          "count": 2
        },
      ],
      "holders": [
        {
          "value": "The Apache Software Foundation",
          "count": 1
        },
      "authors": [
        {
          "value": "Gilles Vollant",
          "count": 1
        },
      ],
      "programming_language": [
        {
          "value": "C++",
          "count": 8
        },
      ]
    }
  },
  {
    "facet": "dev",
    "summary": {
      "license_expressions": [
        {
          "value": "zlib",
          "count": 5
        },
      "copyrights": [
        {
          "value": "Copyright Red Hat Middleware LLC, and individual contributors",
          "count": 1
        },
      ],
      "holders": [
        {
          "value": "Mark Adler",
          "count": 3
        },
      ],
      "authors": [
          "value": "Brian Stansberry",
          "count": 1
        },
      ],
      "programming_language": [
        {
          "value": "Java",
          "count": 7
        },
        {
          "value": "C++",
          "count": 5
        }
      ]
    }
  },
],

Note

Summaries for all the facets are generated by default, regardless of facets not having any files under them.

For users who want to know What is a Facet?.


--summary-key-files Option

Dependency

The option --summary-key-files is a sub-option of and requires the options --classify and --summary.

An example Scan:

scancode -clpieu --json-pp output.json samples --classify --summary --summary-key-files

Running the scan with --summary --summary-key-files Plugins creates summaries for key files with the same license, copyright and other scan information, at a codebase level (in addition to the codebase level general summary generated by --summary Plugin)

The resulting JSON file containing the scan results is structured as follows:

{
  "headers": [ ...
  ],
  "summary": {
    "license_expressions": [ ...
    ],
    "copyrights": [ ...
    ],
    "holders": [ ...
    ],
    "authors": [ ...
    ],
    "programming_language": [ ...
    ],
    "packages": []
  },
  "summary_of_key_files": {
    "license_expressions": [
      {
        "value": null,
        "count": 1
      }
    ],
    "copyrights": [
      {
        "value": null,
        "count": 1
      }
    ],
    "holders": [
      {
        "value": null,
        "count": 1
      }
    ],
    "authors": [
      {
        "value": null,
        "count": 1
      }
    ],
    "programming_language": [
      {
        "value": null,
        "count": 1
      }
    ]
  },
  "files": [

These following flags for each file/directory is also present (generated by --classify)

  • “is_legal”

  • “is_manifest”

  • “is_readme”

  • “is_top_level”

  • “is_key_file”


--summary-with-details Option

The --summary plugin summarizes license, copyright and other scan information at the codebase level. Now running the scan with the --summary-with-details plugin instead creates summaries at individual file/directories with the same license, copyright and other scan information, but at a file/directory level (in addition to the the codebase level summary).

An example Scan:

scancode -clpieu --json-pp output.json samples --summary-with-details

Note

--summary is redundant in a scan when --summary-with-details is already selected.

A sample file object in the scan results (a directory level summary of samples/arch) is structured as follows:

{
  "path": "samples/arch",
  "type": "directory",
  "name": "arch",
  "base_name": "arch",
  "extension": "",
  "size": 0,
  "date": null,
  "sha1": null,
  "md5": null,
  "mime_type": null,
  "file_type": null,
  "programming_language": null,
  "is_binary": false,
  "is_text": false,
  "is_archive": false,
  "is_media": false,
  "is_source": false,
  "is_script": false,
  "licenses": [],
  "license_expressions": [],
  "copyrights": [],
  "holders": [],
  "authors": [],
  "packages": [],
  "emails": [],
  "urls": [],
  "is_legal": false,
  "is_manifest": false,
  "is_readme": false,
  "is_top_level": true,
  "is_key_file": false,
  "summary": {
    "license_expressions": [
      {
        "value": "zlib",
        "count": 3
      },
      {
        "value": null,
        "count": 1
      }
    ],
    "copyrights": [
      {
        "value": null,
        "count": 1
      },
      {
        "value": "Copyright (c) Jean-loup Gailly",
        "count": 1
      },
      {
        "value": "Copyright (c) Jean-loup Gailly and Mark Adler",
        "count": 1
      },
      {
        "value": "Copyright (c) Mark Adler",
        "count": 1
      }
    ],
    "holders": [
      {
        "value": null,
        "count": 1
      },
      {
        "value": "Jean-loup Gailly",
        "count": 1
      },
      {
        "value": "Jean-loup Gailly and Mark Adler",
        "count": 1
      },
      {
        "value": "Mark Adler",
        "count": 1
      }
    ],
    "authors": [
      {
        "value": null,
        "count": 4
      }
    ],
    "programming_language": [
      {
        "value": "C++",
        "count": 3
      },
      {
        "value": null,
        "count": 1
      }
    ]
  },
  "files_count": 4,
  "dirs_count": 2,
  "size_count": 127720,
  "scan_errors": []
},

These following flags for each file/directory is also present (generated by --classify)

  • “is_legal”

  • “is_manifest”

  • “is_readme”

  • “is_top_level”

  • “is_key_file”