Add A Post-Scan Plugin
Scan plugins in scancode-toolkit
Many ScanCode features are implemented as built-in plugins that ship with the scancode-toolkit source code. They are usually enabled through scancode-toolkit CLI options and are grouped by plugin type.
Here are the major types of plugins:
Pre-scan plugins (scancode_pre_scan in entry points)
These plugins run before the main scanning steps. They usually filter input files or classify files, and the main scan plugins depend on their results. The base plugin class to be extended is PreScanPlugin at /src/plugincode/pre_scan.py.
Scan plugins (scancode_scan in entry points)
These are the ScanCode plugins that scan files for useful information such as licenses, copyrights, packages and others. Because they work on a per-file basis, they are run with multiprocessing for speed. They can also have post-processing steps that run afterwards and have access to all the per-file scan results. The base plugin class to be extended is ScanPlugin at /src/plugincode/scan.py.
Post-scan plugins (scancode_post_scan in entry points)
These are mainly data processing, summarizing and reporting plugins that depend on all the results of the scan plugins. They add new codebase-level or file-level attributes, and may also remove or modify data as required for consolidation or summarization. The base plugin class to be extended is PostScanPlugin at /src/plugincode/post_scan.py.
Output plugins (scancode_output in entry points)
All supported output options in scancode-toolkit are plugins, and several output options can be selected at once. These plugins convert, process and write the scan data in a specific file format. The base plugin class to be extended is OutputPlugin at /src/plugincode/output.py.
Output Filter Plugins (scancode_output_filter in entry points)
Output filter plugins apply filters to the outputs, modifying what is written. These filters can be based on whether resources had any detections, on ignorables present in licenses, and on other criteria. The base plugin class to be extended is OutputFilterPlugin at /src/plugincode/output_filter.py.
Location Provider Plugins
These plugins provide pre-built binary libraries and utilities, together with their locations, packaged for use in scancode-toolkit. The base plugin class to be extended is LocationProviderPlugin at /src/plugincode/location_provider.py.
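The entry-point group names listed above can be inspected from Python. The following is a minimal sketch, assuming only the standard-library importlib.metadata module; the helper name list_plugins and the PLUGIN_GROUPS list are illustrative and not part of scancode-toolkit. It returns the plugins registered under one group in the current environment, or an empty list when none are installed.

```python
from importlib.metadata import entry_points

# Entry-point group names taken from the plugin types described above.
PLUGIN_GROUPS = [
    "scancode_pre_scan",
    "scancode_scan",
    "scancode_post_scan",
    "scancode_output",
    "scancode_output_filter",
]


def list_plugins(group):
    """Return sorted 'name = module:attr' strings for one entry-point group.

    Illustrative helper (not a scancode-toolkit API).
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points.
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: dict-like mapping of group name to entry points.
        selected = eps.get(group, [])
    return sorted(f"{ep.name} = {ep.value}" for ep in selected)


if __name__ == "__main__":
    for group in PLUGIN_GROUPS:
        print(group, list_plugins(group))
```

Running this in an environment where scancode-toolkit is installed should list the built-in plugins for each group; in an empty environment every list is simply empty.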
Built-In vs. Optional Installation
Built-In
Some post-scan plugins are installed when ScanCode itself is installed. They are specified under [options.entry_points] in the setup.cfg file.
For example, the License Policy Plugin is a built-in plugin, whose code is located here:
https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/plugin_license_policy.py
These plugins do not require any additional installation steps and can be used as soon as ScanCode is up and running.
Optional
ScanCode is also designed to use post-scan plugins that must be installed separately from the installation of ScanCode. The code for this sort of plugin is located here:
https://github.com/nexB/scancode-plugins
This wiki page will focus on optional post-scan plugins.
Example Post-Scan Plugin: Hello ScanCode
To illustrate the creation of a simple post-scan plugin, we’ll create a hypothetical plugin named Hello ScanCode, which will print Hello ScanCode! in your terminal after you’ve run a scan.
Your command will look something like this:
scancode -i -n 2 <path to target codebase> --hello --json <path to JSON output file>
We’ll start by creating three folders:
Top-level folder – /scancode-hello/
2nd-level folder – /src/
3rd-level folder – /hello_scancode/
1. Top-level folder – /scancode-hello/
In the scancode-plugins repository, in the misc directory, add a folder with a relevant name, e.g., scancode-hello. This folder will hold all of your plugin code.
Inside the /scancode-hello/ folder you’ll need to add a folder named src and 7 files.
/src/ – This folder will contain your primary Python code and is discussed in more detail in the following section.
The 7 files are:
.gitignore – See, e.g., /scancode-ignore-binaries/.gitignore
/build/
/dist/
apache-2.0.LICENSE – See, e.g., /scancode-ignore-binaries/apache-2.0.LICENSE
MANIFEST.in
graft src
include setup.py
include setup.cfg
include .gitignore
include README.md
include MANIFEST.in
include NOTICE
include apache-2.0.LICENSE
global-exclude *.py[co] __pycache__ *.*~
NOTICE – See, e.g., /scancode-ignore-binaries/NOTICE
README.md
setup.cfg
[metadata]
license_file = NOTICE
[bdist_wheel]
universal = 1
[aliases]
release = clean --all bdist_wheel
setup.py – This is an example of what our setup.py file would look like:
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
from __future__ import absolute_import
from __future__ import print_function

from glob import glob
from os.path import basename
from os.path import join
from os.path import splitext

from setuptools import find_packages
from setuptools import setup

desc = '''A ScanCode post-scan plugin to illustrate the creation of a simple post-scan plugin.'''

setup(
    name='scancode-hello',
    version='1.0.0',
    license='Apache-2.0 with ScanCode acknowledgment',
    description=desc,
    long_description=desc,
    author='nexB',
    author_email='info@aboutcode.org',
    url='https://github.com/nexB/scancode-plugins/blob/main/misc/scancode-hello/',
    packages=find_packages('src'),
    package_dir={'': 'src'},
    py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')],
    include_package_data=True,
    zip_safe=False,
    classifiers=[
        # complete classifier list: http://pypi.python.org/pypi?%3Aaction=list_classifiers
        'Development Status :: 4 - Beta',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: Apache Software License',
        'Programming Language :: Python',
        'Programming Language :: Python :: 3',
        'Topic :: Utilities',
    ],
    keywords=[
        'scancode', 'plugin', 'post-scan'
    ],
    install_requires=[
        'scancode-toolkit',
    ],
    # Register the plugin class under the post-scan entry-point group.
    entry_points={
        'scancode_post_scan': [
            'hello = hello_scancode.hello_scancode:SayHello',
        ],
    }
)
2. 2nd-level folder – /src/
Add an __init__.py file inside the src folder. This file can be empty, and is used to indicate that the folder should be treated as a Python package directory.
Add a folder that will contain our primary code – we’ll name the folder hello_scancode. If you look at the example of the setup.py file above, you’ll see this line in the entry_points section:
'hello = hello_scancode.hello_scancode:SayHello',
hello is the name under which the plugin is registered; in this example it matches the --hello command flag declared in the plugin class.
The first hello_scancode is the name of the folder we just created.
The second hello_scancode is the name of the .py file containing our code (discussed in the next section).
SayHello is the name of the PostScanPlugin class we create in that file (see sample code below).
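The anatomy of the entry-point line can be made concrete with a few lines of string handling. This is only an illustrative sketch: parse_entry_point is a hypothetical helper written for this page, not something provided by setuptools or scancode-toolkit, and it assumes the simple name = package.module:Class form used in this example.

```python
def parse_entry_point(spec):
    """Split 'name = package.module:Class' into its four parts.

    Hypothetical helper for illustration only; real entry points are
    parsed by the packaging tools, not by plugin authors.
    """
    name, _, target = spec.partition("=")
    module_path, _, class_name = target.strip().partition(":")
    package, _, module = module_path.rpartition(".")
    return {
        "name": name.strip(),      # registered plugin name
        "package": package,        # folder under src/
        "module": module,          # .py file inside that folder
        "class": class_name,       # PostScanPlugin subclass
    }


if __name__ == "__main__":
    parts = parse_entry_point("hello = hello_scancode.hello_scancode:SayHello")
    for key, value in parts.items():
        print(f"{key}: {value}")
```

Applied to the line from setup.py, the four parts come out as hello, hello_scancode, hello_scancode and SayHello, matching the description above.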
3. 3rd-level folder – /hello_scancode/
Add an __init__.py file inside the hello_scancode folder. As noted above, this file can be empty.
Add a hello_scancode.py file.
Imports
from plugincode.post_scan import PostScanPlugin
from plugincode.post_scan import post_scan_impl
from scancode import CommandLineOption
from scancode import POST_SCAN_GROUP
Create a PostScanPlugin class
The PostScanPlugin class (see PostScanPlugin code) inherits from the CodebasePlugin class (see CodebasePlugin code), which inherits from the BasePlugin class (see BasePlugin code).
@post_scan_impl
class SayHello(PostScanPlugin):
    """
    Illustrate a simple "Hello World" post-scan plugin.
    """

    options = [
        CommandLineOption(('--hello',),
            is_flag=True, default=False,
            help='Generate a simple "Hello ScanCode" greeting in the terminal.',
            help_group=POST_SCAN_GROUP)
    ]

    def is_enabled(self, hello, **kwargs):
        return hello

    def process_codebase(self, codebase, hello, **kwargs):
        """
        Say hello.
        """
        if not self.is_enabled(hello):
            return

        print('Hello ScanCode!!')
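To try the is_enabled / process_codebase logic without installing scancode-toolkit, the class can be exercised against a stand-in base class. The sketch below is an assumption-laden simplification: StandInPostScanPlugin and run_plugin are hypothetical names invented for this example, and the real ScanCode runner does considerably more (option parsing, codebase construction, plugin ordering). It only illustrates that the command-line option values reach the plugin as keyword arguments.

```python
import contextlib
import io


class StandInPostScanPlugin:
    """Hypothetical stand-in for plugincode.post_scan.PostScanPlugin."""

    def is_enabled(self, **kwargs):
        return False

    def process_codebase(self, codebase, **kwargs):
        raise NotImplementedError


class SayHello(StandInPostScanPlugin):
    """Same logic as the plugin above, minus the ScanCode-specific parts."""

    def is_enabled(self, hello, **kwargs):
        return hello

    def process_codebase(self, codebase, hello, **kwargs):
        if not self.is_enabled(hello):
            return
        print('Hello ScanCode!!')


def run_plugin(plugin, codebase, **cli_options):
    """Hypothetical driver: call the plugin and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        plugin.process_codebase(codebase, **cli_options)
    return buffer.getvalue()


if __name__ == "__main__":
    # Other options are absorbed by **kwargs, as in the real plugin.
    print(repr(run_plugin(SayHello(), None, hello=True, json="out.json")))
    print(repr(run_plugin(SayHello(), None, hello=False)))
```

Because both methods accept **kwargs, unrelated options such as the output path are ignored, which is why the plugin only needs to name the hello argument it cares about.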
Load the plugin
To load and use the plugin in the normal course, navigate to the plugin’s root folder (in this example: /plugins/scancode-hello/) and run pip install . (don’t forget the final .).
If you’re developing and want to test your work, save your edits and run pip install -e . from the same folder.
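After installing, one quick way to confirm that pip registered the package is to ask for its installed version. This is a small sketch using the standard-library importlib.metadata module; installed_version is an illustrative helper name, and scancode-hello is the distribution name from the example setup.py above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent.

    Illustrative helper; not part of scancode-toolkit.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("scancode-hello")
    if found is None:
        print("scancode-hello is not installed in this environment")
    else:
        print(f"scancode-hello {found} is installed")
```

If the helper returns None, the pip install step most likely ran in a different virtual environment from the one ScanCode uses.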
More-complex examples
This Hello ScanCode example is quite simple. For examples of more-complex structures and functionalities you can take a look at the other post-scan plugins for guidance and ideas.
One good example is the License Policy post-scan plugin. This plugin is installed when ScanCode
is installed and consequently is not located in the /plugins/
directory used for
manually-installed post-scan plugins. The code for the License Policy plugin can be found at
/scancode-toolkit/src/licensedcode/plugin_license_policy.py
and illustrates how a plugin can be used to analyze the results of a ScanCode scan using external
data files and add the results of that analysis as a new field in the ScanCode JSON output file.