.. _cli-output-format-options:

ScanCode output formats
=======================

ScanCode can write scan results in several formats. Select a format with one of the
following options.

Quick reference
---------------

.. include::  /rst-snippets/cli-output-format-options.rst
   :start-line: 3

.. include::  /rst-snippets/note-snippets/cli-output-samples.rst

----

.. _cli-stdout:

.. include::  /rst-snippets/cli-output-to-stdout.rst

----

.. _cli-json-option:

``--json FILE``
---------------

    Among the ScanCode output formats, ``json`` is the most important one and is recommended over
    the others. ScanCode Workbench and other applications that take ScanCode results as input
    accept only the ``json`` format.

    **Example**

    The following command scans the ``samples`` directory and writes the results in
    ``json`` format:

    .. code-block:: shell

        scancode -clpieu --json output.json samples

    .. include::  /rst-snippets/note-snippets/cli-output-json-ugly.rst

    .. figure:: data/cli-output-json-ugly.png

    The JSON file starts with a ``headers`` section holding general information about the scan,
    such as the options used and the number of files. The ``files`` section follows, with one
    entry per scanned file or directory:

    .. code-block:: none

        {
          "headers": [
            {
              "tool_name": "scancode-toolkit",
              "tool_version": "3.1.1",
              "options": {
                "input": [
                  "samples/"
                ],
                "--copyright": true,
                "--email": true,
                "--info": true,
                "--json-pp": "output.json",
                "--license": true,
                "--package": true,
                "--url": true
              },
              "notice": "Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/aboutcode-org/scancode-toolkit/ for support and download.",
              "start_timestamp": "2019-10-19T191117.292858",
              "end_timestamp": "2019-10-19T191219.743133",
              "message": null,
              "errors": [],
              "extra_data": {
                "files_count": 36
              }
            }
          ],
          "files": [
            {
              "path": "samples",
              "type": "directory",
              ...
              "scan_errors": []
            },
            {
              "path": "samples/README",
              "type": "file",
              "name": "README",
              "base_name": "README",
              "extension": "",
              "size": 236,
              "date": "2019-02-12",
              "sha1": "2e07e32c52d607204fad196052d70e3d18fb8636",
              "md5": "effc6856ef85a9250fb1a470792b3f38",
              "mime_type": "text/plain",
              "file_type": "ASCII text",
              "programming_language": null,
              "is_binary": false,
              "is_text": true,
              "is_archive": false,
              "is_media": false,
              "is_source": false,
              "is_script": false,
              "license_detections": [],
              "detected_license_expression": None,
              "detected_license_expression_spdx": None,
              "copyrights": [],
              "holders": [],
              "authors": [],
              "package_data": [],
              "for_packages": [],
              "emails": [],
              "urls": [],
              "files_count": 0,
              "dirs_count": 0,
              "size_count": 0,
              "scan_errors": []
            },
            {...},
            ...
          ]
        }
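
    The layout above can be consumed with a few lines of Python. The following is a minimal
    sketch, not part of ScanCode itself: the ``summarize_scan`` helper and the inline ``sample``
    data are hypothetical, and only keys shown in the listing above (``headers``, ``files``,
    ``path``, ``type``, ``tool_version``) are assumed to exist.

    .. code-block:: python

        import json


        def summarize_scan(scan):
            """Return (tool_version, file_paths) from a parsed ScanCode JSON result."""
            # "headers" is a list; the first entry describes this scan run.
            header = scan["headers"][0]
            # Keep only regular files, skipping directory entries.
            paths = [f["path"] for f in scan["files"] if f["type"] == "file"]
            return header["tool_version"], paths


        # Hypothetical, heavily trimmed stand-in for the content of output.json.
        sample = json.loads("""
        {
          "headers": [{"tool_name": "scancode-toolkit", "tool_version": "3.1.1"}],
          "files": [
            {"path": "samples", "type": "directory"},
            {"path": "samples/README", "type": "file"}
          ]
        }
        """)

        version, paths = summarize_scan(sample)
        print(version, paths)

    For a real scan, the ``sample`` value would instead come from calling ``json.load`` on the
    opened ``output.json`` file.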

----

.. _cli-json-pp-option:

``--json-pp FILE``
------------------

    ``json-pp`` stands for JSON pretty-print. The plain ``json`` format writes the whole output
    on one line, which is hard to read when you open the file directly or print it to stdout.
    The ``--json-pp`` option writes the same JSON results with line breaks and indentation, so
    they are easy to read.

    The following command scans the ``samples`` directory and writes the results in
    ``json-pp`` format:

    .. code-block:: shell

        scancode -clpieu --json-pp output.json samples

    **Example**

    .. code-block:: json

        {
          "path": "samples/zlib/iostream2/zstream.h",
          "type": "file",
          "name": "zstream.h",
          "base_name": "zstream",
          "extension": ".h",
          "size": 9283,
          "date": "2019-02-12",
          "sha1": "fca4540d490fff36bb90fd801cf9cd8fc695bb17",
          "md5": "a980b61c1e8be68d5cdb1236ba6b43e7",
          "sha1_git": "d9a10c0d8e868ebf8da0b3dc95bb0be634c34bfe",
          "mime_type": "text/x-c++",
          "file_type": "C++ source, ASCII text",
          "programming_language": "C++",
          "is_binary": false,
          "is_text": true,
          "is_archive": false,
          "is_media": false,
          "is_source": true,
          "is_script": false,
          "license_detections": [
            "license-expression": "mit-old-style",
            "matches": [
              {
                "license_expression": "mit-old-style",
                "score": 100.0,
                "rule_identifier": "mit-old-style_cmr-no_1.RULE",
                "matcher": "2-aho",
                "rule_length": 71,
                "matched_length": 71,
                "match_coverage": 100.0,
                "rule_relevance": 100
              }
            ]
            "identifier": "mit-old-style-ec759ae0-1234-f138-793e-356789e080c0"
          ],
          "detected_license_expressions": "mit-old-style",
          "detected_license_expressions_spdx": "LicenseRef-scancode-mit-old-style",
          "copyrights": [
            {
              "value": "Copyright (c) 1997 Christian Michelsen Research AS Advanced Computing",
              "start_line": 3,
              "end_line": 5
            }
          ],
          "holders": [
            {
              "value": "Christian Michelsen Research AS Advanced Computing",
              "start_line": 3,
              "end_line": 5
            }
          ],
          "authors": [],
          "package_data": [],
          "emails": [],
          "urls": [
            {
              "url": "http://www.cmr.no/",
              "start_line": 7,
              "end_line": 7
            }
          ],
          "files_count": 0,
          "dirs_count": 0,
          "size_count": 0,
          "scan_errors": []
        }

    This is the recommended output option for ScanCode-Toolkit.

    .. include::  /rst-snippets/note-snippets/cli-output-format-synopsis.rst
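
    A file entry like the one above can be reduced to its license and copyright findings.
    This is only a sketch: ``describe_file`` and the trimmed ``entry`` are hypothetical, and the
    key names (``license_detections``, ``matches``, ``license_expression``, ``copyrights``,
    ``value``) follow the example above and may differ between ScanCode versions.

    .. code-block:: python

        def describe_file(entry):
            """Collect license expressions and copyright statements for one file entry."""
            # Gather the distinct expressions reported by every match of every detection.
            licenses = sorted({
                match["license_expression"]
                for detection in entry.get("license_detections", [])
                for match in detection.get("matches", [])
            })
            copyrights = [c["value"] for c in entry.get("copyrights", [])]
            return {"path": entry["path"], "licenses": licenses, "copyrights": copyrights}


        # Hypothetical, trimmed file entry using keys from the example above.
        entry = {
            "path": "samples/zlib/iostream2/zstream.h",
            "license_detections": [
                {"matches": [{"license_expression": "mit-old-style", "score": 100.0}]}
            ],
            "copyrights": [
                {"value": "Copyright (c) 1997 Christian Michelsen Research AS Advanced Computing"}
            ],
        }

        summary = describe_file(entry)
        print(summary["licenses"], summary["copyrights"])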

----

.. _cli-json-lines-option:

``--json-lines FILE``
---------------------

    The ``--json-lines`` option writes the results in JSON Lines format, where the report for
    each scanned file is written on its own line.

    **Example**

    The following command scans the ``samples`` directory and writes the results in
    ``json-lines`` format:

    .. code-block:: shell

        scancode -clpieu --json-lines output.json samples

    Here is a sample line from a report generated with the ``--json-lines`` option:

    .. code-block:: none

        {"files":[{"path":"samples/zlib/ada","licenses":[],"copyrights":[],"packages":[]}]}

    The header information is also written on one line, which is the first line of the file.

    The whole output file looks like this:

    .. code-block:: none

        {"headers":[{"tool_name":"scancode-toolkit","tool_version":"3.1.1","options":{"input":["samples/"],"--copyright":true,"--email":true,"--info":true,"--json-lines":"output.json","--license":true,"--package":true,"--url":true},"notice":"Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/aboutcode-org/scancode-toolkit/ for support and download.","start_timestamp":"2019-10-19T210920.143831","end_timestamp":"2019-10-19T211052.048182","message":null,"errors":[],"extra_data":{"files_count":36}}]}
        {"files":[{"path":"samples", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/README", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/screenshot.png", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/arch", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/arch/zlib.tar.gz", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/arch/zlib.tar.gz-extract", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/arch/zlib.tar.gz-extract/zlib-1.2.8", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/arch/zlib.tar.gz-extract/zlib-1.2.8/adler32.c", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/arch/zlib.tar.gz-extract/zlib-1.2.8/zlib.h", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/arch/zlib.tar.gz-extract/zlib-1.2.8/zutil.h", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/EULA", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/LICENSE", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/licenses", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/licenses/apache-1.1.txt", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/licenses/apache-2.0.txt", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/licenses/bouncycastle.txt", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/licenses/cpl-1.0.txt", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/licenses/lgpl.txt", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/src", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/src/FixedMembershipToken.java", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/src/GuardedBy.java", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/src/ImmutableReference.java", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/src/RATE_LIMITER.java", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/src/RouterStub.java", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/src/RouterStubManager.java", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/JGroups/src/S3_PING.java", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/adler32.c", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/deflate.c", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/deflate.h", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/zlib.h", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/zutil.c", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/zutil.h", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/ada", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/ada/zlib.ads", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/dotzlib", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/dotzlib/AssemblyInfo.cs", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/dotzlib/ChecksumImpl.cs", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/dotzlib/LICENSE_1_0.txt", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/dotzlib/readme.txt", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/gcc_gvmat64", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/gcc_gvmat64/gvmat64.S", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/infback9", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/infback9/infback9.c", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/infback9/infback9.h", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/iostream2", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/iostream2/zstream.h", ... "scan_errors":[]}]}
        {"files":[{"path":"samples/zlib/iostream2/zstream_test.cpp", ... "scan_errors":[]}]}


    .. include::  /rst-snippets/note-snippets/cli-output-json-lines.rst
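
    Because every line is a standalone JSON object, the file can be read line by line and
    merged back into a single structure. The sketch below assumes the layout shown above, where
    each line holds one object with either a ``headers`` or a ``files`` key; the
    ``merge_json_lines`` helper and the ``sample_lines`` data are hypothetical.

    .. code-block:: python

        import json


        def merge_json_lines(lines):
            """Merge json-lines records into one dict with "headers" and "files"."""
            merged = {"headers": [], "files": []}
            for line in lines:
                line = line.strip()
                if not line:
                    continue  # tolerate blank lines
                record = json.loads(line)
                # Each line holds a single-key object: either "headers" or "files".
                for key, entries in record.items():
                    merged.setdefault(key, []).extend(entries)
            return merged


        # Hypothetical, trimmed stand-in for the lines of a --json-lines output file.
        sample_lines = [
            '{"headers":[{"tool_name":"scancode-toolkit"}]}',
            '{"files":[{"path":"samples","scan_errors":[]}]}',
            '{"files":[{"path":"samples/README","scan_errors":[]}]}',
        ]

        merged = merge_json_lines(sample_lines)
        print([f["path"] for f in merged["files"]])

    Passing an open file object instead of ``sample_lines`` works the same way, since iterating
    over a text file yields one line at a time.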

----

.. _cli-comparing-json-output-file-formats:

Comparing different ``json`` output formats
-------------------------------------------

    Default ``--json`` Output:

    .. figure:: data/cli-output-json.png

    ``--json-pp`` Output:

    .. figure:: data/cli-output-jsonpp.png

    ``--json-lines`` Output:

    .. figure:: data/cli-output-json-lines.png

----

.. _cli-rdf-option:

``--spdx-rdf FILE``
-------------------

    `SPDX <https://spdx.org/>`_ stands for "Software Package Data Exchange" and is an open standard
    for communicating software bill of materials information (including components, licenses,
    copyrights, and security references).

    **Example**

    The following command scans the ``samples`` directory and writes the results in
    ``spdx-rdf`` format:

    .. code-block:: shell

        scancode -clpieu --spdx-rdf output.spdx samples

    Learn more about SPDX specifications `here <https://spdx.org/specifications>`_ and in this GitHub
    `repository <https://github.com/spdx/spdx-spec>`_.

    The output file is structured as a dictionary of named properties and classes using W3C's
    `RDF Technology <https://www.w3.org/RDF/>`_.

    .. figure:: data/cli-output-spdx-rdf1.png

----

.. _cli-spdx-tv-option:

``--spdx-tv FILE``
------------------

    This format is the SPDX tag-value variant, a plain-text file made of ``Tag: value`` lines.

    The following command scans the ``samples`` directory and writes the results in
    ``spdx-tv`` format:

    .. code-block:: shell

        scancode -clpieu --spdx-tv output.spdx samples

    An SPDX-TV file starts with:

    .. code-block:: none

        # Document Information

        SPDXVersion: SPDX-2.1
        DataLicense: CC0-1.0
        DocumentComment: <text>Generated with ScanCode and provided on an "AS IS" BASIS, WITHOUT WARRANTIES
        OR CONDITIONS OF ANY KIND, either express or implied. No content created from
        ScanCode should be considered or used as legal advice. Consult an Attorney
        for any legal advice.
        ScanCode is a free software code scanning tool from nexB Inc. and others.
        Visit https://github.com/aboutcode-org/scancode-toolkit/ for support and download.</text>


        # Creation Info

        Creator: Tool: ScanCode 2.2.1
        Created: 2019-09-22T21:55:04Z

    Next comes a section titled ``#Packages``, followed by a list:

    .. figure:: data/cli-output-spdx-tv-package.png

    The information for each file is listed under a ``#File`` title, with the following fields:

    .. hlist::
        :columns: 3

        - FileName
        - FileChecksum
        - LicenseConcluded
        - LicenseInfoInFile
        - FileCopyrightText

    **Example**

    .. figure:: data/cli-output-spdx-tv-file.png

    After the files section, there's a section for licenses under a ``#Licences`` title, with the
    following information for each license:

    .. hlist::
        :columns: 3

        - LicenseID
        - LicenseComment
        - ExtractedText

    .. figure:: data/cli-output-spdx-tv-licenses.png
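
    The ``Tag: value`` lines can be read with a small parser. The sketch below is not a complete
    SPDX parser: it only handles the two shapes visible in the excerpt above, single-line values
    and ``<text>...</text>`` blocks that may span several lines. The ``parse_tag_values`` helper
    and the shortened ``sample`` text are hypothetical.

    .. code-block:: python

        import re

        # A value is either a <text>...</text> block (which may span lines)
        # or the remainder of the line.
        TAG_VALUE = re.compile(
            r"^(\w+):[ \t]*(?:<text>(.*?)</text>|([^\n]*))$",
            re.MULTILINE | re.DOTALL,
        )


        def parse_tag_values(document):
            """Return a list of (tag, value) pairs, skipping "#" comment lines."""
            pairs = []
            for match in TAG_VALUE.finditer(document):
                tag, text_block, plain = match.groups()
                pairs.append((tag, text_block if text_block is not None else plain))
            return pairs


        # Hypothetical, shortened stand-in for the start of an SPDX-TV file.
        sample = "\n".join([
            "# Document Information",
            "",
            "SPDXVersion: SPDX-2.1",
            "DataLicense: CC0-1.0",
            "DocumentComment: <text>Generated with ScanCode",
            "and provided on an AS IS basis.</text>",
            "",
            "# Creation Info",
            "",
            "Creator: Tool: ScanCode 2.2.1",
        ])

        pairs = parse_tag_values(sample)
        print(pairs[0], pairs[-1])

    Returning a list of pairs rather than a dictionary keeps repeated tags, such as the
    per-file fields, in their original order.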

----

.. _cli-html-option:

``--html FILE``
---------------

    ScanCode can also write the scan results as a simple ``html`` page that you can open in your
    favorite browser. It gives a quick view of the detected licenses, copyrights, and other
    main information in the form of tables.

    The following command scans the ``samples`` directory and writes the results in
    HTML format:

    .. code-block:: shell

        scancode -clpieu --html output.html samples

    The generated HTML page contains the following tables:

    .. hlist::
        :columns: 2

        - Copyright and Licenses Information
        - File Information
        - Package Information
        - License References (SPDX ID, Links to spdx/scancode/licensedb/License Homepage)

    .. include::  /rst-snippets/note-snippets/cli-output-html-license-references.rst

    .. figure:: data/cli-output-html1.png

    .. figure:: data/cli-output-html2.png

    .. figure:: data/cli-output-html3.png

----

.. _cli-html-app-option:

``--html-app FILE``
-------------------

    ScanCode also supports formatting the output as an HTML visualization tool, which is more
    helpful than the standard HTML format.

    .. include::  /rst-snippets/warning-snippets/cli-output-htmlapp-deprecated.rst

    The following command scans the ``samples`` directory and writes the results in
    ``html-app`` format:

    .. code-block:: shell

        scancode -clpieu --html-app output.html samples

    The scanned files are shown in the left sidebar, and the section on the right contains
    separate tabs for the following:

    .. hlist::
        :columns: 2

        - License Summary
        - Copyright Summary
        - Clues
        - File Details
        - Packages

    .. include::  /rst-snippets/note-snippets/cli-output-htmlapp-search.rst

    .. figure:: data/cli-output-html-app1.png

    .. figure:: data/cli-output-html-app2.png

    .. figure:: data/cli-output-html-app3.png

----

.. _cli-csv-option:

``--csv FILE``
--------------

    ScanCode can write results as a ``.csv`` file, which opens in any spreadsheet application.

    .. note::

        This option is deprecated and will be replaced by new CSV and tabular
        output formats in the next ScanCode release. Visit
        https://github.com/aboutcode-org/scancode-toolkit/issues/3043
        for details and to provide inputs and feedback.

    The following command scans the ``samples`` directory and writes the results in
    ``csv`` format:

    .. code-block:: shell

        scancode -lpceiu --csv sample.csv samples

    The first line of the CSV file contains the column headings:

    .. hlist::
        :columns: 3

        - Resource
        - type
        - name
        - base_name
        - extension
        - date
        - size
        - sha1
        - md5
        - sha1_git
        - files_count
        - mime_type
        - file_type
        - programming_language
        - is_binary
        - is_text
        - is_archive
        - is_media
        - is_source
        - is_script
        - scan_errors
        - license__key
        - license__score
        - license__short_name
        - license__category
        - license__owner
        - license__homepage_url
        - license__text_url
        - license__reference_url
        - license__spdx_license_key
        - license__spdx_url
        - matched_rule__identifier
        - matched_rule__license_choice
        - matched_rule__licenses
        - copyright
        - copyright_holder
        - author
        - email
        - start_line
        - end_line
        - url
        - package__type
        - package__name
        - package__version
        - package__primary_language
        - package__summary
        - package__description
        - package__size
        - package__release_date
        - package__homepage_url
        - package__notes
        - package__bug_tracking_url
        - package__vcs_repository
        - package__copyright_top_level

    Each subsequent line represents one element, which can be any of the following:

    .. hlist::
        :columns: 5

        - license
        - copyright
        - package
        - email
        - url

    If a file contains multiple elements, each element gets its own row with the details
    mentioned earlier.

    .. figure:: data/cli-output-csv.png
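
    Since one file can span several rows, a common first step is to group the rows by their
    ``Resource`` value. The following sketch uses Python's standard ``csv`` module; the
    ``rows_by_resource`` helper, the resource path and the cell values in ``sample_csv`` are
    hypothetical, and only a few of the columns listed above are included.

    .. code-block:: python

        import csv
        import io
        from collections import defaultdict


        def rows_by_resource(csv_file):
            """Group the rows of a ScanCode CSV by their "Resource" column."""
            grouped = defaultdict(list)
            for row in csv.DictReader(csv_file):
                # Drop empty cells so each row keeps only the element it describes.
                grouped[row["Resource"]].append({k: v for k, v in row.items() if v})
            return dict(grouped)


        # Hypothetical, trimmed stand-in for sample.csv with a few of the columns.
        sample_csv = io.StringIO(
            "Resource,type,copyright,url,start_line,end_line\n"
            "/zlib/zstream.h,file,,,,\n"
            "/zlib/zstream.h,,Copyright (c) 1997 Example,,3,5\n"
            "/zlib/zstream.h,,,http://www.cmr.no/,7,7\n"
        )

        grouped = rows_by_resource(sample_csv)
        print(len(grouped["/zlib/zstream.h"]))

    With a real scan, an opened ``sample.csv`` file (opened with ``newline=""`` as the ``csv``
    module recommends) would take the place of ``sample_csv``.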

----

.. _cli-cyclonedx-json-option:

``--cyclonedx FILE``
--------------------

    ScanCode also supports the `CycloneDX <https://cyclonedx.org/specification/overview/>`_
    output format.

    Note that this output format is only useful when scanning with the ``--package`` option.

    This output format is particularly useful if you want to process ScanCode results
    in downstream tools that can't process ScanCode's native JSON output,
    but do support CycloneDX BOMs.

    To run an example scan on the test resources, try:

    .. code-block:: shell

        scancode --package --cyclonedx=bom.json tests/formattedcode/data/cyclonedx/simple

    If you prefer XML output over JSON, use the ``--cyclonedx-xml`` option instead.
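
    A downstream script can read the resulting JSON BOM directly. The sketch below assumes the
    standard CycloneDX JSON layout, with a top-level ``components`` array whose entries carry
    ``name`` and ``version``; the ``list_components`` helper and the inline ``bom`` data are
    hypothetical and do not reflect the actual content of the test resources.

    .. code-block:: python

        import json


        def list_components(bom):
            """Return (name, version) pairs for the components of a CycloneDX BOM."""
            return [(c.get("name"), c.get("version")) for c in bom.get("components", [])]


        # Hypothetical, trimmed stand-in for the content of bom.json.
        bom = json.loads("""
        {
          "bomFormat": "CycloneDX",
          "specVersion": "1.3",
          "components": [
            {"type": "library", "name": "example-lib", "version": "1.0.0"}
          ]
        }
        """)

        print(list_components(bom))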

----

.. _cli-cyclonedx-xml-option:

``--cyclonedx-xml FILE``
-------------------------

    This option writes CycloneDX BOMs in XML format instead of JSON.

    **Example**

    .. code-block:: shell

        scancode --package --cyclonedx-xml=bom.xml tests/formattedcode/data/cyclonedx/simple

----

..
  ToDo:
  Document these output options:

    --custom-output FILE    Write scan output to FILE formatted with the custom
                            Jinja template file.
    --debian FILE           Write scan output in machine-readable Debian
                            copyright format to FILE.
