dcc_json_toolkit.dcc_schema

Schema for DCC files.

DccSchema

DccSchema(
    xsd_schema_version: str, no_cache: bool = False, schema_url: str | None = None
)

XML schema manager to work directly with DCC files.

DccSchema converts any data that follows a valid DCC schema between XML and a Python dict, in both directions (XML <-> Python dict). It is a wrapper around xmlschema.XMLSchema that exposes only the required methods.

It must be initialized with the correct version number of the schema.

The schema requires an internet connection both for validating and for operating with the released XSD schema.

Parameters:
  • xsd_schema_version (str) –

    Version of the released XSD schema. For example: '3.3.0' or '3.4.0-rc.2'.

  • schema_url (str | None, default: None ) –

    URL where the schema is published. If not provided, the XSD file is assumed to be available at https://ptb.de/dcc/v{VERSION}/dcc.xsd

  • no_cache (bool, default: False ) –

    If True, the instance works directly with the hosted 'dcc.xsd' file, which requires an internet connection to initialize the schema.
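
The fallback described for schema_url can be sketched as follows. This is only an illustration of the documented URL pattern; the helper `resolve_schema_url` and the constant `DEFAULT_URL_TEMPLATE` are hypothetical names and are not part of dcc_json_toolkit.

```python
from __future__ import annotations

# Hypothetical constant: mirrors the URL pattern quoted in the parameter description.
DEFAULT_URL_TEMPLATE = "https://ptb.de/dcc/v{version}/dcc.xsd"


def resolve_schema_url(xsd_schema_version: str, schema_url: str | None = None) -> str:
    """Return the explicit URL if one is given, else the default URL for the version."""
    if schema_url is not None:
        return schema_url
    return DEFAULT_URL_TEMPLATE.format(version=xsd_schema_version)


print(resolve_schema_url("3.3.0"))
# https://ptb.de/dcc/v3.3.0/dcc.xsd
```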

extract_elements

extract_elements(xml_source: str | DccSourceContent, element_name: str) -> list[dict]

Scans a source and returns, as dictionaries, all the elements in it that have the given name.

Parameters:
  • xml_source (str | DccSourceContent) –

    Either a string with the XML contents of a valid DCC or the path to a local XML file containing one.

  • element_name (str) –

    Name of the element to extract, including its namespace prefix. For example: dcc:list or si:real.

Returns:
  • elements( list[dict] ) –

    A list with all the matching elements (from any tree level), each represented as a dict. If no element is found, an empty list is returned.

Raises:
  • DccSchemaError

    If the provided source is not a valid DCC or contains errors.
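
To illustrate what "from any tree level" means, here is a minimal sketch that uses only the standard library's xml.etree.ElementTree. It is not how DccSchema works internally: it performs no schema validation and returns Element objects instead of dicts. The helper `find_elements` and the `NAMESPACES` mapping are assumptions made for this example.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Assumption: namespace prefixes as they appear in the examples on this page.
NAMESPACES = {"dcc": "https://ptb.de/dcc", "si": "https://ptb.de/si"}


def find_elements(xml_text: str, element_name: str) -> list[ET.Element]:
    """Return every element called `element_name` (e.g. 'dcc:name') at any depth."""
    prefix, _, local_name = element_name.partition(":")
    qualified = f"{{{NAMESPACES[prefix]}}}{local_name}"
    root = ET.fromstring(xml_text.strip())
    # Element.iter() walks the whole tree in document order, root included.
    return list(root.iter(qualified))


SAMPLE = """
<dcc:list xmlns:dcc="https://ptb.de/dcc">
  <dcc:name><dcc:content lang="en">Outer</dcc:content></dcc:name>
  <dcc:quantity>
    <dcc:name><dcc:content lang="en">Inner</dcc:content></dcc:name>
  </dcc:quantity>
</dcc:list>
"""

names = find_elements(SAMPLE, "dcc:name")
print(len(names))                        # 2
print(find_elements(SAMPLE, "si:real"))  # []
```

Both dcc:name elements are found even though they sit at different depths, and a name that does not occur yields an empty list, as in the documented return value.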

find_valid_xpaths

find_valid_xpaths(target_element: str, as_iterator: bool = False) -> list[str]
find_valid_xpaths(target_element: str, as_iterator: bool = True) -> Iterator[str]

Finds all the possible XPaths for a given element.

Parameters:
  • target_element (str) –

    Name of the element to look up, prefixed with its namespace. Examples: 'dcc:list', 'si:real', 'dcc:name'.

  • as_iterator (bool, default: False ) –

    Flag to control whether the return value is a list with all the possible XPaths (False) or an iterator over them (True).

Raises:
  • DccSchemaError
    1. When the provided target is not specified with its namespace.
    2. If the provided namespace is unknown for the schema version.
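
The two error conditions above can be pictured with a small sketch. It is an assumption about the kind of check involved, not the library's code: the helper `split_qualified_name` and the `KNOWN_PREFIXES` set are hypothetical, and a plain ValueError stands in for DccSchemaError (which derives from ValueError).

```python
from __future__ import annotations

# Assumption: the prefixes used in this page's examples; the real set depends on the schema version.
KNOWN_PREFIXES = {"dcc", "si"}


def split_qualified_name(target_element: str) -> tuple[str, str]:
    """Split 'prefix:name' and reject targets with a missing or unknown prefix."""
    prefix, sep, local_name = target_element.partition(":")
    if not sep or not prefix or not local_name:
        # Condition 1: the target was not specified with its namespace.
        raise ValueError(f"{target_element!r} must be given as 'prefix:name'")
    if prefix not in KNOWN_PREFIXES:
        # Condition 2: the namespace is unknown.
        raise ValueError(f"unknown namespace prefix {prefix!r}")
    return prefix, local_name


print(split_qualified_name("dcc:list"))
# ('dcc', 'list')
```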

get_scanned_version staticmethod

get_scanned_version(dcc_file_path: str) -> str

Scans the DCC document to find out which XSD version it follows.

Notes

This method is deprecated and will not exist in any release later than 1.1.

get_schema_uri

get_schema_uri(no_cache: bool) -> str

URI path to the released XSD schema.

Based on the no_cache parameter, the method returns either the URL where the XSD file is published (no_cache=True) or the local path to the cached file (no_cache=False).

is_valid

is_valid(xml_source: str | DccSourceContent) -> bool

Checks whether an XML source is a valid DCC.

to_dict

to_dict(xml_source: str | DccSourceContent) -> dict

Conversion of the file's content to a dictionary.

Parameters:
  • xml_source (str | DccSourceContent) –

    Either a path to an existing XML file or the XML contents to convert.

Returns:
  • dict

    The dictionary extracted from the xml_source. The returned instance is built only from the contents of the provided source. Each attribute is stored under a key of the form @{attr_name}. See the examples below.

Examples:

  1. A complete DCC as xml source.
    >>> xml_source = '''
    ... <dcc:digitalCalibrationCertificate
    ...     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    ...     xmlns:dcc="https://ptb.de/dcc"
    ...     xmlns:si="https://ptb.de/si"
    ...     xsi:schemaLocation="https://ptb.de/dcc https://ptb.de/dcc/v3.4.0-rc.2/dcc.xsd"
    ...     schemaVersion="3.4.0-rc.2">
    ...     <dcc:administrativeData...>
    ...     <dcc:measurementResults...>
    ... </dcc:digitalCalibrationCertificate>
    ... '''
    >>> dcc_as_dict = DccSchema("3.4.0-rc.2").to_dict(xml_source)
    >>> dcc_as_dict
    {
        '@xmlns:xsi': "http://www.w3.org/2001/XMLSchema-instance",
        '@xmlns:dcc': "https://ptb.de/dcc",
        '@xmlns:si': "https://ptb.de/si",
        '@xsi:schemaLocation': "https://ptb.de/dcc https://ptb.de/dcc/v3.4.0-rc.2/dcc.xsd",
        '@schemaVersion': "3.4.0-rc.2",
        'dcc:administrativeData': {...},
        'dcc:measurementResults': {...},
    }
  2. A subschema of the DCC, holding a dcc:list element (a table).
    >>> xml_source = '''
    ... <dcc:list id="exampleTable" tableDimension="1">
    ...    <dcc:name>
    ...       <dcc:content lang="en">Example name</dcc:content>
    ...    </dcc:name>
    ...    <dcc:quantity...>
    ... </dcc:list>
    ... '''
    >>> table_as_dict = DccSchema("3.4.0-rc.2").to_dict(xml_source)
    >>> table_as_dict
    {
        '@id': "exampleTable",
        '@tableDimension': 1,
        'dcc:name': {
            'dcc:content': [{'$': 'Example name', '@lang': 'en'}]
        },
        'dcc:quantity': {...},
    }
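
The key convention seen in these outputs ('@' for attributes, '$' for text content) can be illustrated with a schema-unaware sketch based on the standard library. The helper `leaf_to_dict` is hypothetical and far simpler than the real conversion: it handles a single leaf element, ignores namespaces, and leaves every value as a string, whereas the schema-driven conversion also decides types and which children become lists.

```python
import xml.etree.ElementTree as ET


def leaf_to_dict(element: ET.Element) -> dict:
    """Map attributes to '@name' keys and non-empty text to the '$' key."""
    result = {f"@{name}": value for name, value in element.attrib.items()}
    text = (element.text or "").strip()
    if text:
        result["$"] = text
    return result


content = ET.fromstring('<content lang="en">Example name</content>')
print(leaf_to_dict(content))
# {'@lang': 'en', '$': 'Example name'}
```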

to_xml_string

to_xml_string(data: dict, subschema_element: str | None = None) -> str

Encoding data from a dictionary into an XML string.

Parameters:
  • data (dict) –

    Content, structured as a dictionary, which is to be converted into an XML string.

  • subschema_element (str | None, default: None ) –

    Name of the element whose contents are given in the data parameter. This parameter is only required for subschemas. For example: dcc:list or si:real.

Examples:

These examples show the inverse of the conversions shown for the to_dict() method.

  1. A complete DCC:
    >>> dcc_data = {
    ... '@xmlns:xsi': "http://www.w3.org/2001/XMLSchema-instance",
    ... '@xmlns:dcc': "https://ptb.de/dcc",
    ... '@xmlns:si': "https://ptb.de/si",
    ... '@xsi:schemaLocation': "https://ptb.de/dcc https://ptb.de/dcc/v3.4.0-rc.2/dcc.xsd",
    ... '@schemaVersion': "3.4.0-rc.2",
    ... 'dcc:administrativeData': {...},
    ... 'dcc:measurementResults': {...},
    ... }
    >>> dcc_as_xml_string = DccSchema("3.4.0-rc.2").to_xml_string(dcc_data)
    >>> dcc_as_xml_string
    <dcc:digitalCalibrationCertificate
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:dcc="https://ptb.de/dcc"
        xmlns:si="https://ptb.de/si"
        xsi:schemaLocation="https://ptb.de/dcc https://ptb.de/dcc/v3.4.0-rc.2/dcc.xsd"
        schemaVersion="3.4.0-rc.2">
        <dcc:administrativeData...>
        <dcc:measurementResults...>
    </dcc:digitalCalibrationCertificate>
  2. A subschema of the DCC, holding a dcc:list element (a table).
    >>> dcc_table = {
    ...     "@id": "exampleTable",
    ...     "@tableDimension": 1,
    ...     "dcc:name": {"dcc:content": [{"$": "Example name", "@lang": "en"}]},
    ...     "dcc:quantity": {...},
    ... }
    >>> table_as_xml_string = DccSchema("3.4.0-rc.2").to_xml_string(
    ...     dcc_table, subschema_element="dcc:list"
    ... )
    >>> table_as_xml_string
    <dcc:list id="exampleTable" tableDimension="1">
       <dcc:name>
          <dcc:content lang="en">Example name</dcc:content>
       </dcc:name>
       <dcc:quantity...>
    </dcc:list>
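
As a rough picture of the encoding direction, the sketch below turns a dict that uses the '@' (attribute) and '$' (text) keys into a single XML element with the standard library. The helper `dict_to_leaf` is hypothetical; it ignores namespaces, nesting and schema validation, all of which the real method handles through the XSD schema.

```python
import xml.etree.ElementTree as ET


def dict_to_leaf(tag: str, data: dict) -> ET.Element:
    """Build one element: '@name' keys become attributes, '$' becomes the text."""
    element = ET.Element(tag)
    for key, value in data.items():
        if key.startswith("@"):
            element.set(key[1:], str(value))
        elif key == "$":
            element.text = str(value)
    return element


leaf = dict_to_leaf("content", {"$": "Example name", "@lang": "en"})
print(ET.tostring(leaf, encoding="unicode"))
# <content lang="en">Example name</content>
```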

validate_dcc

validate_dcc(xml_source: str | DccSourceContent, concise: bool = True) -> list[str]
validate_dcc(
    xml_source: str | DccSourceContent, concise: bool = False
) -> list[XMLSchemaValidationError]

Validates XML data against the XSD schema/component instance.

The function assumes that the provided XML source is valid XML. Any error raised by this function means that the source is not valid XML.

Parameters:
  • xml_source (str | DccSourceContent) –

    Input to validate, which can be either:

    • A string with all the XML contents.
    • A path to the XML file to validate.

    If the source corresponds to a subschema, the first line must define the type of element with the correct XML structure. For example, considering a table, the first line of the xml_source must be <dcc:list ....

  • concise (bool, default: True ) –

    Flag to define the output format of the errors. If True, each error is formatted as a one-line string. If False, the complete error instance is returned.

Returns:
  • errors( list[str] | list[XMLSchemaValidationError] ) –

    List of all the errors found during the validation. If the source is valid, an empty list is returned.
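
Since the function assumes that its input is already valid XML, a caller may want to check well-formedness first. The following is one possible pre-check using only the standard library; the helper `is_well_formed` is hypothetical, is not part of dcc_json_toolkit, and only accepts XML content as a string (not a file path).

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<a><b/></a>"))    # True
print(is_well_formed("<a><b></a>"))     # False: <b> is never closed
```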

DccSchemaError

Bases: ValueError

Error related to invalid schema issues.