Scancode Output Formats

Scan results generated by Scancode are available in different formats, to be specified by the following options.

All Scan Output Options

--json FILE

Write scan output as compact JSON to FILE.

--json-pp FILE

Write scan output as pretty-printed JSON to FILE. This is one of the recommended output formats; like the YAML output format, it contains all the data ScanCode can show.

--json-lines FILE

Write scan output as JSON Lines to FILE.

--yaml FILE

Write scan output as YAML to FILE. This is one of the recommended output formats; like the JSON output format, it contains all the data ScanCode can show.

--csv FILE

DEPRECATED: Write scan output as CSV to FILE. This option is deprecated and will be replaced by new CSV and tabular output formats in the next ScanCode release. Visit this issue for details, and to provide input and feedback: https://github.com/nexB/scancode-toolkit/issues/3043

--html FILE

Write scan output as HTML to FILE.

--custom-output FILE

Write scan output to FILE formatted with the custom Jinja template file.

Mandatory Sub-option:

--custom-template FILE

--custom-template FILE

Use this Jinja template FILE as a custom template.

Sub-Option of: --custom-output

--debian FILE

Write scan output in machine-readable Debian copyright format to FILE.

--spdx-rdf FILE

Write scan output as SPDX RDF to FILE.

--spdx-tv FILE

Write scan output as SPDX Tag/Value to FILE.

--html-app FILE

[DEPRECATED] Use scancode-workbench instead. Write scan output as a mini HTML application to FILE.

--cyclonedx FILE

Write scan output as a CycloneDx 1.3 BOM in pretty-printed JSON format to FILE.

--cyclonedx-xml FILE

Write scan output as a CycloneDx 1.3 BOM in pretty-printed XML format to FILE.

Warning

The html-app feature has been deprecated; use Scancode Workbench (https://github.com/nexB/scancode-workbench) instead to visualize scan results. See also How to Visualize Scan results.

Note

You can output scan results in two different file formats simultaneously in one scan. An example: scancode -clpieu --json-pp output.json --html output.html samples

Note

All the examples and snippets that follow have been generated by scanning the samples folder distributed with scancode-toolkit.

Print to `stdout` (Terminal)

If you want to format the output as JSON and print it to stdout, replace the JSON filename with a “-”, as in --json-pp - instead of --json-pp output.json.

The following command will output the scan results in JSON format to stdout (in the terminal):

./scancode -clpieu --json-pp - samples/
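
Printing to stdout makes it possible to pipe a scan into another program. The following is a minimal Python sketch, assuming that only the JSON document arrives on the stream and that it has a top-level `files` list as in the samples on this page. The function name `summarize_scan` is hypothetical, and the inline string stands in for real scan output.

```python
import io
import json


def summarize_scan(stream):
    """Read one ScanCode JSON document from a text stream and count entries by type."""
    scan = json.load(stream)
    counts = {}
    for entry in scan.get("files", []):
        kind = entry.get("type", "unknown")
        counts[kind] = counts.get(kind, 0) + 1
    return counts


# Stand-in for piped scan output; a real run would pass sys.stdin instead.
sample = io.StringIO(
    '{"headers": [], "files": ['
    '{"path": "samples", "type": "directory"},'
    '{"path": "samples/README", "type": "file"}]}'
)
print(summarize_scan(sample))
```

If the function were saved in a script (say, a hypothetical summarize.py that calls `summarize_scan(sys.stdin)`), the scan command above could be piped into it with `| python summarize.py`.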

`--json FILE`

Among the ScanCode Output Formats, json is the most important one, and is recommended over others. Scancode Workbench and other applications that use Scancode Result data as input accept only the json format.

The following code performs a scan on the samples directory, and publishes the results in json format:

scancode -clpieu --json output.json samples

Note

The default json format prints the whole report without line breaks/spaces/indentations, which can be ugly to look at.

The entire JSON file is structured in the following manner:

It starts with general information about the scan: which options were used, the number of files, and so on. An entry for each scanned file and directory then follows.

{
  "headers": [
    {
      "tool_name": "scancode-toolkit",
      "tool_version": "3.1.1",
      "options": {
        "input": [
          "samples/"
        ],
        "--copyright": true,
        "--email": true,
        "--info": true,
        "--json-pp": "output.json",
        "--license": true,
        "--package": true,
        "--url": true
      },
      "notice": "Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.",
      "start_timestamp": "2019-10-19T191117.292858",
      "end_timestamp": "2019-10-19T191219.743133",
      "message": null,
      "errors": [],
      "extra_data": {
        "files_count": 36
      }
    }
  ],
  "files": [
    {
      "path": "samples",
      "type": "directory",
      ...
      "scan_errors": []
    },
    {
      "path": "samples/README",
      "type": "file",
      "name": "README",
      "base_name": "README",
      "extension": "",
      "size": 236,
      "date": "2019-02-12",
      "sha1": "2e07e32c52d607204fad196052d70e3d18fb8636",
      "md5": "effc6856ef85a9250fb1a470792b3f38",
      "mime_type": "text/plain",
      "file_type": "ASCII text",
      "programming_language": null,
      "is_binary": false,
      "is_text": true,
      "is_archive": false,
      "is_media": false,
      "is_source": false,
      "is_script": false,
      "license_detections": [],
      "detected_license_expression": null,
      "detected_license_expression_spdx": null,
      "copyrights": [],
      "holders": [],
      "authors": [],
      "package_data": [],
      "for_packages": [],
      "emails": [],
      "urls": [],
      "files_count": 0,
      "dirs_count": 0,
      "size_count": 0,
      "scan_errors": []
    },
    {...},
    ...
  ]
}
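
To make the two-part layout concrete, here is a small Python sketch that reads the header block and then walks the file entries. It assumes the structure shown in the sample above; the inline document is a trimmed stand-in for output.json, and the helper name `describe_scan` is hypothetical.

```python
import json

# Trimmed stand-in for output.json, keeping only keys shown in the sample above.
SAMPLE = """
{
  "headers": [
    {
      "tool_name": "scancode-toolkit",
      "tool_version": "3.1.1",
      "options": {"input": ["samples/"], "--copyright": true, "--json-pp": "output.json"},
      "errors": [],
      "extra_data": {"files_count": 36}
    }
  ],
  "files": [
    {"path": "samples", "type": "directory", "scan_errors": []},
    {"path": "samples/README", "type": "file", "scan_errors": []}
  ]
}
"""


def describe_scan(scan):
    """Return the tool, its enabled option names, and the paths of scanned files."""
    header = scan["headers"][0]
    # Option names start with "--"; the "input" key holds the scanned paths instead.
    options = sorted(
        name for name, value in header["options"].items()
        if name.startswith("--") and value
    )
    paths = [entry["path"] for entry in scan["files"] if entry["type"] == "file"]
    return {
        "tool": f'{header["tool_name"]} {header["tool_version"]}',
        "options": options,
        "declared_files": header["extra_data"]["files_count"],
        "file_paths": paths,
    }


summary = describe_scan(json.loads(SAMPLE))
print(summary)
```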

`--json-pp FILE`

json-pp stands for JSON Pretty-Print format. In the previous format, i.e. simple json, the whole output is printed on one line, which is hard to read if you’re looking at the file itself (or printing to stdout). This option formats the same results as JSON with proper spacing and indentation, which is much easier to read.

The following code performs a scan on the samples directory, and publishes the results in json-pp format:

scancode -clpieu --json-pp output.json samples

A sample JSON output for an individual file will look like:

{
  "path": "samples/zlib/iostream2/zstream.h",
  "type": "file",
  "name": "zstream.h",
  "base_name": "zstream",
  "extension": ".h",
  "size": 9283,
  "date": "2019-02-12",
  "sha1": "fca4540d490fff36bb90fd801cf9cd8fc695bb17",
  "md5": "a980b61c1e8be68d5cdb1236ba6b43e7",
  "mime_type": "text/x-c++",
  "file_type": "C++ source, ASCII text",
  "programming_language": "C++",
  "is_binary": false,
  "is_text": true,
  "is_archive": false,
  "is_media": false,
  "is_source": true,
  "is_script": false,
  "license_detections": [
    {
      "license_expression": "mit-old-style",
      "matches": [
        {
          "license_expression": "mit-old-style",
          "score": 100.0,
          "rule_identifier": "mit-old-style_cmr-no_1.RULE",
          "matcher": "2-aho",
          "rule_length": 71,
          "matched_length": 71,
          "match_coverage": 100.0,
          "rule_relevance": 100
        }
      ],
      "identifier": "mit-old-style-ec759ae0-1234-f138-793e-356789e080c0"
    }
  ],
  "detected_license_expression": "mit-old-style",
  "detected_license_expression_spdx": "LicenseRef-scancode-mit-old-style",
  "copyrights": [
    {
      "value": "Copyright (c) 1997 Christian Michelsen Research AS Advanced Computing",
      "start_line": 3,
      "end_line": 5
    }
  ],
  "holders": [
    {
      "value": "Christian Michelsen Research AS Advanced Computing",
      "start_line": 3,
      "end_line": 5
    }
  ],
  "authors": [],
  "package_data": [],
  "emails": [],
  "urls": [
    {
      "url": "http://www.cmr.no/",
      "start_line": 7,
      "end_line": 7
    }
  ],
  "files_count": 0,
  "dirs_count": 0,
  "size_count": 0,
  "scan_errors": []
},

This is the recommended Output option for Scancode Toolkit.
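
A per-file entry like the one above can be queried directly once it is loaded. The sketch below assumes that each item of `license_detections` is an object with `license_expression` and `matches` keys and that copyrights carry `value`, `start_line` and `end_line`. The helper name `license_and_copyright` and its `min_score` threshold are hypothetical, and the inline entry is a trimmed stand-in.

```python
import json

# Stand-in for one entry of the "files" list, trimmed to the keys used below.
ENTRY = json.loads("""
{
  "path": "samples/zlib/iostream2/zstream.h",
  "license_detections": [
    {
      "license_expression": "mit-old-style",
      "matches": [
        {"license_expression": "mit-old-style", "score": 100.0, "matcher": "2-aho"}
      ]
    }
  ],
  "copyrights": [
    {
      "value": "Copyright (c) 1997 Christian Michelsen Research AS Advanced Computing",
      "start_line": 3,
      "end_line": 5
    }
  ]
}
""")


def license_and_copyright(entry, min_score=90.0):
    """Collect license expressions whose matches all reach min_score, plus copyrights."""
    expressions = [
        detection["license_expression"]
        for detection in entry.get("license_detections", [])
        if all(match["score"] >= min_score for match in detection.get("matches", []))
    ]
    copyrights = [
        (item["value"], item["start_line"], item["end_line"])
        for item in entry.get("copyrights", [])
    ]
    return expressions, copyrights


expressions, copyrights = license_and_copyright(ENTRY)
print(expressions)
print(copyrights)
```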

`--json-lines FILE`

ScanCode also has a --json-lines format option, where each report of a file scanned is formatted in one line.

The following code performs a scan on the samples directory, and publishes the results in json-lines format:

scancode -clpieu --json-lines output.json samples

Here is a sample line from a report generated by the jsonlines format:

{"files":[{"path":"samples/zlib/ada","licenses":[],"copyrights":[],"packages":[]}]}

The header information is also formatted in one line (i.e. the first line of the file).

The whole Output file looks like:

{"headers":[{"tool_name":"scancode-toolkit","tool_version":"3.1.1","options":{"input":["samples/"],"--copyright":true,"--email":true,"--info":true,"--json-lines":"output.json","--license":true,"--package":true,"--url":true},"notice":"Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.","start_timestamp":"2019-10-19T210920.143831","end_timestamp":"2019-10-19T211052.048182","message":null,"errors":[],"extra_data":{"files_count":36}}]}
{"files":[{"path":"samples", ... "scan_errors":[]}]}
{"files":[{"path":"samples/README", ... "scan_errors":[]}]}
{"files":[{"path":"samples/screenshot.png", ... "scan_errors":[]}]}
{"files":[{"path":"samples/arch", ... "scan_errors":[]}]}
{"files":[{"path":"samples/arch/zlib.tar.gz", ... "scan_errors":[]}]}
{"files":[{"path":"samples/arch/zlib.tar.gz-extract", ... "scan_errors":[]}]}
{"files":[{"path":"samples/arch/zlib.tar.gz-extract/zlib-1.2.8", ... "scan_errors":[]}]}
{"files":[{"path":"samples/arch/zlib.tar.gz-extract/zlib-1.2.8/adler32.c", ... "scan_errors":[]}]}
{"files":[{"path":"samples/arch/zlib.tar.gz-extract/zlib-1.2.8/zlib.h", ... "scan_errors":[]}]}
{"files":[{"path":"samples/arch/zlib.tar.gz-extract/zlib-1.2.8/zutil.h", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/EULA", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/LICENSE", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/licenses", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/licenses/apache-1.1.txt", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/licenses/apache-2.0.txt", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/licenses/bouncycastle.txt", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/licenses/cpl-1.0.txt", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/licenses/lgpl.txt", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/src", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/src/FixedMembershipToken.java", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/src/GuardedBy.java", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/src/ImmutableReference.java", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/src/RATE_LIMITER.java", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/src/RouterStub.java", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/src/RouterStubManager.java", ... "scan_errors":[]}]}
{"files":[{"path":"samples/JGroups/src/S3_PING.java", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/adler32.c", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/deflate.c", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/deflate.h", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/zlib.h", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/zutil.c", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/zutil.h", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/ada", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/ada/zlib.ads", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/dotzlib", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/dotzlib/AssemblyInfo.cs", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/dotzlib/ChecksumImpl.cs", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/dotzlib/LICENSE_1_0.txt", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/dotzlib/readme.txt", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/gcc_gvmat64", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/gcc_gvmat64/gvmat64.S", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/infback9", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/infback9/infback9.c", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/infback9/infback9.h", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/iostream2", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/iostream2/zstream.h", ... "scan_errors":[]}]}
{"files":[{"path":"samples/zlib/iostream2/zstream_test.cpp", ... "scan_errors":[]}]}

Note

This jsonlines format also omits other file information like type, name, date, extension, sha1 and md5 hashes, programming language etc.
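
Because every line is a complete JSON object, a json-lines report can be read one line at a time. This is a minimal Python sketch under the assumption that the first line holds `headers` and each later line holds a one-element `files` list, as in the listing above. The helper name `read_json_lines` is hypothetical and the inline report is a shortened stand-in.

```python
import json

# Stand-in for a --json-lines report: one JSON object per line, header first.
REPORT = "\n".join([
    '{"headers":[{"tool_name":"scancode-toolkit","extra_data":{"files_count":2}}]}',
    '{"files":[{"path":"samples","scan_errors":[]}]}',
    '{"files":[{"path":"samples/README","scan_errors":[]}]}',
])


def read_json_lines(lines):
    """Merge a line-per-record report back into one headers/files dictionary."""
    merged = {"headers": [], "files": []}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        for key in merged:
            merged[key].extend(record.get(key, []))
    return merged


scan = read_json_lines(REPORT.splitlines())
print([entry["path"] for entry in scan["files"]])
```

Passing an open file object instead of `REPORT.splitlines()` would work the same way, since iterating over a text file also yields one line at a time.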

Comparing Different `json` Output Formats

Default --json Output:

--json-pp Output:

--json-lines Output:
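
The difference between the three layouts can be imitated with Python's standard json module. This sketch only reproduces the layouts for a tiny hand-written dictionary; it is not how ScanCode itself writes its output, and the separators and indent width are assumptions.

```python
import json

scan = {
    "headers": [{"tool_name": "scancode-toolkit"}],
    "files": [{"path": "samples"}, {"path": "samples/README"}],
}

# --json style: the whole document on a single line, no extra whitespace.
compact = json.dumps(scan, separators=(",", ":"))

# --json-pp style: the same document, indented for reading.
pretty = json.dumps(scan, indent=2)

# --json-lines style: one line for the headers, then one line per file.
lines = [json.dumps({"headers": scan["headers"]}, separators=(",", ":"))]
lines += [
    json.dumps({"files": [entry]}, separators=(",", ":"))
    for entry in scan["files"]
]
json_lines = "\n".join(lines)

print(compact)
print(pretty)
print(json_lines)
```

The first two strings parse back to the same dictionary; only the whitespace differs. The third must be read line by line, because the file as a whole is not a single JSON document.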

`--spdx-rdf FILE`

SPDX stands for “Software Package Data Exchange” and is an open standard for communicating software bill of material information (including components, licenses, copyrights, and security references).

The following code performs a scan on the samples directory, and publishes the results in spdx-rdf format:

scancode -clpieu --spdx-rdf output.spdx samples

Learn more in the SPDX specifications and in the SPDX GitHub repository.

Here the file is structured as a dictionary of named properties and classes using W3C’s RDF Technology.

`--spdx-tv FILE`

This format is another SPDX variant, in which the output file is plain text made of Tag: value lines.

The following code performs a scan on the samples directory, and publishes the results in spdx-tv format:

scancode -clpieu --spdx-tv output.spdx samples

An SPDX-TV file starts with:

# Document Information

SPDXVersion: SPDX-2.1
DataLicense: CC0-1.0
DocumentComment: <text>Generated with ScanCode and provided on an "AS IS" BASIS, WITHOUT WARRANTIES
OR CONDITIONS OF ANY KIND, either express or implied. No content created from
ScanCode should be considered or used as legal advice. Consult an Attorney
for any legal advice.
ScanCode is a free software code scanning tool from nexB Inc. and others.
Visit https://github.com/nexB/scancode-toolkit/ for support and download.</text>


# Creation Info

Creator: Tool: ScanCode 2.2.1
Created: 2019-09-22T21:55:04Z

After a section titled #Packages, the list of files follows.

The information for each file is listed under a #File title, with these fields:

FileName

FileChecksum

LicenseConcluded

LicenseInfoInFile

FileCopyrightText

An example goes as follows:

After the files section, there’s a section for licenses under a #Licences title, with the following information for each license:

LicenseID

LicenseComment

ExtractedText

Here’s an example:

`--html FILE`

ScanCode supports formatting the output in a simple HTML format that can be opened with your favorite browser. This helps to quickly visualize the detected licenses, copyrights and other main information in the form of tables.

The following code performs a scan on the samples directory, and publishes the results in HTML format:

scancode -clpieu --html output.html samples

The generated HTML page has the following tables:

Copyright and Licenses Information

File Information

Package Information

Licenses (Links to Dejacode/License Homepage)

`--html-app FILE`

ScanCode also supports formatting the output as an HTML visualization tool, which is more helpful than the standard HTML format.

Warning

The html-app feature has been deprecated; use Scancode Workbench (https://github.com/nexB/scancode-workbench) instead to visualize scan results. See also How to Visualize Scan results.

The following code performs a scan on the samples directory, and publishes the results in html-app format:

scancode -clpieu --html-app output.html samples

The scanned files are shown in the left sidebar, and the section on the right contains separate tabs for the following:

License Summary

Copyright Summary

Clues

File Details

Packages

Note

The HTML app also contains a Search option to easily find what you are looking for. But the HTML app output is deprecated and we recommend using scancode-workbench instead: https://github.com/nexB/scancode-workbench.

`--csv FILE`

ScanCode can publish results in the useful .csv format.

Note

This option is deprecated and will be replaced by new CSV and tabular output formats in the next ScanCode release. Visit https://github.com/nexB/scancode-toolkit/issues/3043 for details and to provide inputs and feedback.

The following code performs a scan on the samples directory, and publishes the results in csv format:

scancode -lpceiu --csv sample.csv samples

The first line of the csv file contains the headings, and they are:

Resource,

type,

name,

base_name,

extension,

date,

size,

sha1,

md5,

files_count,

mime_type,

file_type,

programming_language,

is_binary,

is_text,

is_archive,

is_media,

is_source,

is_script,

scan_errors,

license__key,

license__score,

license__short_name,

license__category,

license__owner,

license__homepage_url,

license__text_url,

license__reference_url,

license__spdx_license_key,

license__spdx_url,

matched_rule__identifier,

matched_rule__license_choice,

matched_rule__licenses,

copyright,

copyright_holder,

author,

email,

start_line,

end_line,

url,

package__type,

package__name,

package__version,

package__primary_language,

package__summary,

package__description,

package__size,

package__release_date,

package__homepage_url,

package__notes,

package__bug_tracking_url,

package__vcs_repository,

package__copyright_top_level

Each subsequent line represents one element, i.e. can be any of the following:

license

copyright

package

email

url

So if there are multiple elements in a file, each of them gets its own row with the details mentioned earlier.
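
Since each element sits on its own row, the rows usually need to be regrouped by file before use. The Python sketch below does this with the standard csv module. It uses only a handful of the headings listed above, the rows are hand-written for illustration (the license line numbers in particular are made up), and the helper name `elements_by_resource` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hand-written stand-in for sample.csv; illustrative rows, not real scan output.
CSV_TEXT = """Resource,type,license__key,copyright,url,start_line,end_line
samples/README,file,,,,,
samples/zlib/iostream2/zstream.h,file,mit-old-style,,,9,17
samples/zlib/iostream2/zstream.h,file,,Copyright (c) 1997 Christian Michelsen Research AS Advanced Computing,,3,5
samples/zlib/iostream2/zstream.h,file,,,http://www.cmr.no/,7,7
"""


def elements_by_resource(text):
    """Group the element rows (license, copyright, url) under their Resource path."""
    element_columns = ("license__key", "copyright", "url")
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        for column in element_columns:
            if row[column]:
                grouped[row["Resource"]].append((column, row[column]))
    return dict(grouped)


grouped = elements_by_resource(CSV_TEXT)
for resource, elements in grouped.items():
    print(resource, elements)
```

Rows that carry only file information, such as the samples/README row here, contribute no elements and therefore do not appear in the result.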

`--cyclonedx FILE`

Scancode also supports the CycloneDx output format.

Please note that this output format is only useful when scanning with the --package option.

This output format is particularly useful if you want to process ScanCode results in downstream tools that can’t process ScanCode’s native JSON output, but do support CycloneDx BOMs.

To run an example scan on the test resources try: ./scancode --package --cyclonedx=bom.json tests/formattedcode/data/cyclonedx/simple

If you prefer XML output over JSON, please have a look at the --cyclonedx-xml option instead.

`--cyclonedx-xml FILE`

This option allows outputting CycloneDx BOMs in XML format instead of JSON.

To run an example scan on the test resources try: ./scancode --package --cyclonedx-xml=bom.xml tests/formattedcode/data/cyclonedx/simple
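
A downstream tool would typically read the components out of the JSON BOM. The following Python sketch assumes the standard CycloneDX JSON layout, with a top-level `bomFormat` marker and a `components` list whose items have `name` and `version`. The BOM shown is hand-written for illustration and is not actual ScanCode output, and the helper name `list_components` is hypothetical.

```python
import json

# Hand-written minimal BOM for illustration; component names are made up.
BOM = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.3",
  "version": 1,
  "components": [
    {"type": "library", "name": "example-lib", "version": "1.0.0"},
    {"type": "library", "name": "other-lib", "version": "2.3.1"}
  ]
}
""")


def list_components(bom):
    """Return (name, version) pairs for every component in a CycloneDX JSON BOM."""
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX BOM")
    return [
        (component["name"], component.get("version", ""))
        for component in bom.get("components", [])
    ]


print(list_components(BOM))
```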

Custom Output Format

While the built-in output formats are convenient for a variety of use cases, one may wish to create their own output template, using the following arguments:

`--custom-output FILE --custom-template TEMP_FILE`

ScanCode makes this very easy, as it uses the popular Jinja2 template engine. Simply pass the path to the custom template to the --custom-template argument, or drop it in a folder in the src/scancode/templates directory.

For example, if I wanted a simple CLI output I would create a template.txt with the particular data I wish to see. In this case, I am only interested in the license and copyright data for this particular scan.

## template.txt:
[
    {% if files.license_copyright %}
        {% for location, data in files.license_copyright.items() %}
            {% for row in data %}
  location:"{{ location }}",
  {% if row.what == 'copyright' %}copyright:"{{ row.value|escape }}",{% endif %}
             {% endfor %}
         {% endfor %}
    {% endif %}
]

Note

The file name and extension of the template file do not matter.

Now I can run ScanCode using my newly created template:

$ scancode -clpeui --custom-output output.txt --custom-template template.txt samples
Scanning files...
  [####################################]  46
Scanning done.

Now the results are saved in output.txt and we can easily view them with head output.txt:

[
  location:"samples/JGroups/LICENSE",
  copyright:"Copyright (c) 1991, 1999 Free Software Foundation, Inc.",

  location:"samples/JGroups/LICENSE",
  copyright:"copyrighted by the Free Software Foundation",
]

For a more elaborate template, refer to the default template given with Scancode, which generates the HTML output of the --html output format option.

Documentation on Jinja templates.
