Transformation

If you want to know more details about transformation passes, please take a look at section “Transformation Pass” in chapter Internals.

Submodules

Transformation Passes

Base Class

Guide to writing QONNX transformations

  • Your transformation must inherit the Transformation abstract base class.

  • Your transformation’s apply function should take in a ModelWrapper and return a tuple (transformed_model: ModelWrapper, model_was_changed: bool).

  • The transformations are meant to be applied using the .transform function in ModelWrapper. This makes a deep copy of the input model by default, so you don’t have to.

  • model_was_changed indicates whether your transformation made any changes to the model. If you know your transformation needs to be called only once and repeated calls have no further effect, you can return False even if the model was changed.

  • You MUST return model_was_changed=False at some point when your transformation is called multiple times, otherwise apply_repeated() will loop infinitely.

  • If you cannot guarantee that the transformation will reach a fixed point, you must declare this, return model_was_changed = False and let the user manually re-apply the transform.
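
As an illustration, here is a minimal sketch of a transformation that follows these rules (ClearDocStrings is a hypothetical example, not part of QONNX):

from qonnx.transformation.base import Transformation

class ClearDocStrings(Transformation):
    """Hypothetical example: clear the doc_string field of every node."""

    def apply(self, model):
        graph_modified = False
        for node in model.graph.node:
            if node.doc_string:
                node.doc_string = ""
                graph_modified = True
        # after the first pass nothing changes anymore, so repeated calls
        # return False and apply_repeated() can terminate
        return (model, graph_modified)

# .transform() deep-copies the model before applying the transformation
model = model.transform(ClearDocStrings())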

class qonnx.transformation.base.NodeLocalTransformation(num_workers=None)

Bases: Transformation

Parent class for transformations that can be executed locally on a single node, by accessing and modifying the attributes of only that node. This allows the class to automatically parallelize the transformation. Transformations subclassing NodeLocalTransformation must implement the abstract method applyNodeLocal(). A read-only copy of the model is available as the member variable ref_input_model, but any modifications made there will be disregarded.

To control the degree of parallelization, specify the num_workers argument in the constructor, using one of the following values:

  • None: use the NUM_DEFAULT_WORKERS environment variable
  • 0: use all available CPU cores
  • any other int > 0: use that number of parallel workers

apply(model)
abstract applyNodeLocal(node)
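
A minimal sketch of subclassing NodeLocalTransformation (ClearNodeDocStrings is a hypothetical example; applyNodeLocal is assumed to receive a single node and return the possibly modified node together with a changed flag, mirroring apply()):

from qonnx.transformation.base import NodeLocalTransformation

class ClearNodeDocStrings(NodeLocalTransformation):
    """Hypothetical example: clear each node's doc_string field,
    with the work distributed over parallel workers."""

    def applyNodeLocal(self, node):
        node_modified = False
        if node.doc_string:
            node.doc_string = ""
            node_modified = True
        return (node, node_modified)

# num_workers=0 uses all available CPU cores
model = model.transform(ClearNodeDocStrings(num_workers=0))
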
class qonnx.transformation.base.Transformation

Bases: ABC

Base class that all transformations are derived from. Contains only the abstract method apply(), which every transformation has to implement.

abstract apply(model)

qonnx.transformation.batchnorm_to_affine

class qonnx.transformation.batchnorm_to_affine.BatchNormToAffine

Bases: Transformation

Replaces any test-time BatchNorm layers with Mul-Add layers.

apply(model)

qonnx.transformation.bipolar_to_xnor

class qonnx.transformation.bipolar_to_xnor.ConvertBipolarMatMulToXnorPopcount

Bases: Transformation

Convert MatMul nodes with all-bipolar inputs to XnorPopcountMatMul and associated result correction.

apply(model)

qonnx.transformation.change_3d_tensors_to_4d

class qonnx.transformation.change_3d_tensors_to_4d.Change3DTo4DTensors

Bases: Transformation

Replaces 3D tensors with 4D tensors assuming the following format: [N, C, H] -> [N, C, H, 1]. The attributes of a (specific) set of supported nodes are changed accordingly. If the graph contains unsupported nodes, a warning is raised and the transformation is not applied.

apply(model)

qonnx.transformation.change_batchsize

class qonnx.transformation.change_batchsize.ChangeBatchSize(bsize)

Bases: Transformation

Change the batch size dimension to the given value for the entire graph by changing it for the global input/output and removing all intermediate shapes (will need a call to shape inference to restore shapes). Will attempt to handle any Reshape nodes with constant shape parameters by changing the batch size dimension value in the parameter.

apply(model: ModelWrapper)
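
A typical usage, followed by shape inference to restore the removed intermediate shapes (a minimal sketch; the filename is hypothetical):

from qonnx.core.modelwrapper import ModelWrapper
from qonnx.transformation.change_batchsize import ChangeBatchSize
from qonnx.transformation.infer_shapes import InferShapes

model = ModelWrapper("model.onnx")  # hypothetical filename
model = model.transform(ChangeBatchSize(4))
# intermediate shapes were removed, so re-run shape inference
model = model.transform(InferShapes())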

qonnx.transformation.change_datalayout

class qonnx.transformation.change_datalayout.ChangeDataLayoutQuantAvgPool2d

Bases: Transformation

Replace a QuantAvgPool2d node using the (N,C,H,W) data layout with a QuantAvgPool2dNHWC node using the (N,H,W,C) data layout, surrounded by the required Transpose nodes.

apply(model)

qonnx.transformation.channels_last

class qonnx.transformation.channels_last.AbsorbChanFirstIntoMatMul

Bases: Transformation

Removes a transpose-to-channels-first node if it sits in front of a Flatten and a MatMul (or Gemm) node.

The channels-first transpose is fused into the initializer of the Quant node acting as the weight tensor for the MatMul/Gemm node. Reshape nodes with shape [1, -1] are also supported in place of Flatten nodes. Regardless of whether the flattening was performed by a Flatten or a Reshape node, a Flatten node will be re-inserted in front of the MatMul node.

Note: This transformation removes some of the tensor shapes on the downstream path, so running shape inference afterwards is advised.

apply(model)
class qonnx.transformation.channels_last.ConvertToChannelsLastAndClean(make_input_channels_last=False)

Bases: Transformation

Converts data-layout-dependent nodes to ChannelsLast nodes and inserts the required Transpose nodes. It then tries to eliminate as many of these transposes as possible and moves the remaining ones as far upstream as possible.

Parameters:

make_input_channels_last (bool) – Also make the input of the network channels-last; otherwise a Transpose node will be left at the beginning of the network. Defaults to False.

apply(model)
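
A typical invocation (a minimal sketch, assuming a ModelWrapper instance named model):

from qonnx.transformation.channels_last import ConvertToChannelsLastAndClean

# leave a Transpose at the graph input so external data can stay channels-first
model = model.transform(ConvertToChannelsLastAndClean(make_input_channels_last=False))
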
class qonnx.transformation.channels_last.InsertChannelsLastDomainsAndTrafos

Bases: Transformation

Inserts the ChannelsLast domain where required, and also inserts the required Transpose nodes.

apply(model)
class qonnx.transformation.channels_last.MoveChanFirstDownstream

Bases: Transformation

Moves channels-first transposes further downstream.

apply(model)
class qonnx.transformation.channels_last.MoveChanLastUpstream

Bases: Transformation

Moves channels-last transposes further upstream.

apply(model)
class qonnx.transformation.channels_last.RemoveConsecutiveChanFirstAndChanLastTrafos

Bases: Transformation

Removes two consecutive transposes that cancel each other out: (ChannelsLast -> ChannelsFirst) followed by (ChannelsFirst -> ChannelsLast), i.e. the first converts to channels first and the second back to channels last.

apply(model)

qonnx.transformation.create_generic_partitions

class qonnx.transformation.create_generic_partitions.PartitionFromDict(partitioning={}, partition_dir=None)

Bases: Transformation

Split a graph into partitions. Each resulting partition node has a model attribute indicating the path to the subordinate onnx file. Cleanup and InferShapes() transformations should be applied first.

This transformation builds on PartitionFromLambda() and takes a dictionary that defines partitions based on node indices.

Argument 0: partitioning

  • Dictionary with the following format: { partition_id : node_index_list }
  • Example: {0 : [3,4,5], 1 : range(10, 15)}

Argument 1 (optional): partition_dir

  • Manually define where to save the partition models

apply(model)
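
For example, using the dictionary format described above (a minimal sketch, assuming a cleaned-up ModelWrapper instance named model):

from qonnx.transformation.create_generic_partitions import PartitionFromDict

# nodes 3-5 become partition 0, nodes 10-14 become partition 1;
# all remaining nodes stay in the parent graph
model = model.transform(PartitionFromDict({0: [3, 4, 5], 1: range(10, 15)}))
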
class qonnx.transformation.create_generic_partitions.PartitionFromLambda(partitioning=<function PartitionFromLambda.<lambda>>, partition_dir=None)

Bases: Transformation

Split a graph into partitions. Each resulting partition node has a model attribute indicating the path to the subordinate onnx file. Cleanup and InferShapes() transformations should be applied first.

Argument 0: partitioning

  • Function performing the mapping: node -> partition_id (int or string)
  • Partitions may not cover the graph completely (nodes mapped to -1 are retained)
  • Mapping must return -1 for GenericPartition nodes

Argument 1 (optional): partition_dir

  • Manually define where to save the partition models

apply(model)
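
For example, to move all Conv nodes into one partition (a minimal sketch, assuming a cleaned-up ModelWrapper instance named model):

from qonnx.transformation.create_generic_partitions import PartitionFromLambda

# Conv nodes go to partition 0; everything else (-1) is retained in place
partitioning = lambda node: 0 if node.op_type == "Conv" else -1
model = model.transform(PartitionFromLambda(partitioning=partitioning))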

qonnx.transformation.double_to_single_float

class qonnx.transformation.double_to_single_float.DoubleToSingleFloat

Bases: Transformation

Convert any float64 initializers to float32.

apply(model)

qonnx.transformation.expose_intermediate

class qonnx.transformation.expose_intermediate.ExposeIntermediateTensorsLambda(tensor_filter=<function ExposeIntermediateTensorsLambda.<lambda>>)

Bases: Transformation

apply(model: ModelWrapper)
class qonnx.transformation.expose_intermediate.ExposeIntermediateTensorsPatternList(pattern_list, dynamic_only=True)

Bases: ExposeIntermediateTensorsLambda

pattern_filter(tname, model)

qonnx.transformation.extend_partition

class qonnx.transformation.extend_partition.ExtendPartition(extend_index)

Bases: Transformation

Extends GenericPartition-type nodes by inserting the graph pointed to by their model attribute.

Argument 0: extend_index

  • List that contains the node indices of the GenericPartition nodes

apply(model)

qonnx.transformation.extract_conv_bias

class qonnx.transformation.extract_conv_bias.ExtractBiasFromConv

Bases: Transformation

Extracts the (optional) Bias from a Conv(Transpose) node and inserts it behind the Conv(Transpose) node as an Add node.

apply(model)

qonnx.transformation.extract_quant_scale_zeropt

class qonnx.transformation.extract_quant_scale_zeropt.ExtractQuantScaleZeroPt

Bases: Transformation

Extract any non-identity scale and zero-point Quant inputs as separate Div/Mul (for scale) and Add/Sub (for zero-point) nodes, preceding and following the Quant node.

apply(model: ModelWrapper)

qonnx.transformation.fold_constants

class qonnx.transformation.fold_constants.FoldConstants(exclude_op_types=['Quant', 'BipolarQuant'])

Bases: Transformation

Replace the output of a node with const-only inputs with a precomputed result. Skip any op types given in exclude_op_types.

apply(model)
class qonnx.transformation.fold_constants.FoldConstantsFiltered(match_filter_fxn)

Bases: Transformation

Replace the output of a node with const-only inputs with a precomputed result. Use the match_filter_fxn(model, node) function to decide which nodes can be eligible for const folding.

apply(model)

qonnx.transformation.gemm_to_matmul

class qonnx.transformation.gemm_to_matmul.GemmToMatMul

Bases: Transformation

Converts Gemm nodes into a MatMul and an Add node. This transformation is built to support version 9 of the Gemm node, as documented here: https://github.com/onnx/onnx/blob/master/docs/Changelog.md#Gemm-9. However, earlier and later versions of the node are likely to work as well. Explicitly not supported are the optionality of input C in versions >=11 and the broadcast attribute of versions <=6.

apply(model)

qonnx.transformation.general

class qonnx.transformation.general.ApplyConfig(config, node_filter=<function ApplyConfig.<lambda>>)

Bases: Transformation

Applies node properties (attributes) from either a config dict or its JSON representation given as a filename. The JSON file can specify default values for particular op_types, as well as values for nodes with particular names. Example dict:

{
# set kernel_size = 3 for all nodes with op_type=Im2Col
"Defaults" : {"kernel_size" : [3, ["Im2Col"]]},
# set kernel_size = 7 for the particular node with name Im2Col_0
"Im2Col_0" : {"kernel_size" : 7}
}
apply(model)
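
Applying the dict above could look like this (a minimal sketch, assuming a ModelWrapper instance named model):

from qonnx.transformation.general import ApplyConfig

config = {
    # set kernel_size = 3 for all nodes with op_type=Im2Col
    "Defaults": {"kernel_size": [3, ["Im2Col"]]},
    # set kernel_size = 7 for the particular node with name Im2Col_0
    "Im2Col_0": {"kernel_size": 7},
}
model = model.transform(ApplyConfig(config))
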
class qonnx.transformation.general.ConvertDivToMul

Bases: Transformation

Convert divide by constant nodes to multiply by constant nodes.

apply(model)
class qonnx.transformation.general.ConvertSubToAdd

Bases: Transformation

Convert subtract-a-constant nodes to add-a-constant nodes.

apply(model)
class qonnx.transformation.general.GiveRandomTensorNames

Bases: Transformation

Give random tensor names to all tensors.

apply(model)
class qonnx.transformation.general.GiveReadableTensorNames

Bases: Transformation

Give more human-readable names to all internal tensors. You should apply GiveUniqueNodeNames prior to this transform to avoid empty node names, as the readable names are based on the node names.

apply(model)
class qonnx.transformation.general.GiveUniqueNodeNames(prefix='')

Bases: Transformation

Give unique names to each node in the graph using enumeration, starting with given prefix (if specified in the constructor).

apply(model)
class qonnx.transformation.general.GiveUniqueParameterTensors

Bases: Transformation

Make every parameter tensor unique. The aim is to avoid affecting other nodes apart from the one the system is currently operating on.

apply(model)
class qonnx.transformation.general.MovePadAttributeToTensor

Bases: Transformation

Move padding info from attribute into input tensor for Pad nodes.

apply(model)
class qonnx.transformation.general.RemoveStaticGraphInputs

Bases: Transformation

Remove any top-level graph inputs that have initializers.

apply(model)
class qonnx.transformation.general.RemoveUnusedTensors

Bases: Transformation

Remove any unused tensors in the graph by removing the initializers, ValueInfo and tensor annotations associated with them. Unused tensors do not appear as input or output of any graph node.

apply(model)
class qonnx.transformation.general.SortGraph

Bases: Transformation

Returns the model with its node list sorted topologically. Any ONNX graph to be executed must have a topologically sorted node list, as dictated by the ONNX standard.

apply(model)

qonnx.transformation.infer_data_layouts

class qonnx.transformation.infer_data_layouts.InferDataLayouts

Bases: Transformation

Try to infer data layout annotations for all input/intermediate/output tensors based on inputs and node type.

apply(model)

qonnx.transformation.infer_datatypes

class qonnx.transformation.infer_datatypes.InferDataTypes

Bases: Transformation

Infer QONNX DataType info for all intermediate/output tensors based on inputs and node type.

apply(model)
qonnx.transformation.infer_datatypes.infer_mac_result_dtype(idtypes, possible_negation)
qonnx.transformation.infer_datatypes.is_scaled_int(x)

qonnx.transformation.infer_shapes

class qonnx.transformation.infer_shapes.InferShapes

Bases: Transformation

Ensure every tensor in the model has a specified shape (ValueInfo).

apply(model)

qonnx.transformation.insert_topk

class qonnx.transformation.insert_topk.InsertTopK(k=5, axis=-1, largest=1, sorted=1)

Bases: Transformation

Add TopK node at the network output and replace the graph output with the TopK indices.

apply(model)
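
For example, to report the indices of the top-5 scores at the network output (a minimal sketch, assuming a classification model wrapped in ModelWrapper):

from qonnx.transformation.insert_topk import InsertTopK

# the graph output is replaced by the indices of the 5 largest values
model = model.transform(InsertTopK(k=5, axis=-1, largest=1, sorted=1))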

qonnx.transformation.lower_convs_to_matmul

class qonnx.transformation.lower_convs_to_matmul.LowerConvsToMatMul

Bases: Transformation

Replace Conv layers with pairs of Im2Col-MatMul layers, plus Transpose layers to keep the original data layout.

apply(model)

qonnx.transformation.make_input_chanlast

class qonnx.transformation.make_input_chanlast.MakeInputChannelsLast

Bases: Transformation

For networks with an input using the NCx data layout, add a transpose node at the beginning and mark the input as using NxC (channels-last).

apply(model)

qonnx.transformation.merge_onnx_models

class qonnx.transformation.merge_onnx_models.MergeONNXModels(pre_model)

Bases: Transformation

Merges two models. The model passed in the transformation will be inserted before the model the transformation is applied on, the resulting model is returned. This transformation will try to connect graph.output[0] of the pre model and graph.input[0] of the post model. If more than one input or output exists, a warning is raised.

apply(model)
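
For instance, to prepend a preprocessing graph (a minimal sketch; the filenames are hypothetical):

from qonnx.core.modelwrapper import ModelWrapper
from qonnx.transformation.merge_onnx_models import MergeONNXModels

pre_model = ModelWrapper("preproc.onnx")  # hypothetical filename
model = ModelWrapper("network.onnx")      # hypothetical filename
# graph.output[0] of pre_model is connected to graph.input[0] of model
model = model.transform(MergeONNXModels(pre_model))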

qonnx.transformation.pruning

class qonnx.transformation.pruning.ApplyMasks(prune_spec: Dict)

Bases: Transformation

Apply the given sparsity masks in prune_spec to the appropriately named tensors in the model. These masks are only annotations, no actual pruning is performed at this stage.

apply(model: ModelWrapper) → Tuple[ModelWrapper, bool]
class qonnx.transformation.pruning.PropagateMasks(lossy: bool = True)

Bases: Transformation

Propagate the sparsity masks in the network to relevant upstream and downstream layers. Some initial sparsity masks must have been applied, either manually or with the ApplyMasks transformation. Note that not all layer types are supported; see the update_node_mask function for details.

apply(model: ModelWrapper) → Tuple[ModelWrapper, bool]
class qonnx.transformation.pruning.PruneChannels(prune_spec: Dict, lossy: bool = True)

Bases: Transformation

Prune channels from specified tensors and their dependencies from a model, as specified by the dictionary given in prune_spec. This dictionary must be formatted as {tensor_name : {axis : {channels}}}. See test_pruning.py for examples. If lossy is True, the transformation will aggressively prune all relevant upstream/downstream layers around the specified tensors. This is good for maintaining the consistency of layer shapes, but may introduce a larger accuracy penalty. If lossy is False, the pruning will be more conservative to preserve numerical ranges (e.g. biases won’t be pruned in the downstream layers), but this may lead to inconsistent shapes in the network.

apply(model: ModelWrapper) → Tuple[ModelWrapper, bool]
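
As an illustration of the prune_spec format (a minimal sketch; the tensor name is hypothetical):

from qonnx.transformation.pruning import PruneChannels

# prune channels 0 and 5 along axis 0 of the named tensor and let the
# transformation handle the dependent upstream/downstream layers
prune_spec = {"MatMul_0_param0": {0: {0, 5}}}
model = model.transform(PruneChannels(prune_spec, lossy=True))
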
class qonnx.transformation.pruning.RemoveMaskedChannels(lossy: bool = True)

Bases: Transformation

Remove channels indicated by sparsity masks on the model. The sparsity mask annotations will be removed after they have been processed for each tensor. Does not perform any shape consistency checking and may result in a broken graph.

apply(model: ModelWrapper) → Tuple[ModelWrapper, bool]
qonnx.transformation.pruning.ensure_masktype_is_dict(mask)
qonnx.transformation.pruning.merge_dicts_of_sets(dict1, dict2)
qonnx.transformation.pruning.remove_masked_tensor_channels(tensor_or_shape, mask, axis)
qonnx.transformation.pruning.update_node_mask(node, masks_in, masks_out, lossy=True)

qonnx.transformation.qcdq_to_qonnx

class qonnx.transformation.qcdq_to_qonnx.QCDQToQuant

Bases: Transformation

Fuse a chain of nodes, specifically QuantizeLinear+DequantizeLinear, back into a QONNX Quant node. This transform finds chains of QuantizeLinear followed by DequantizeLinear and fuses them into a single QONNX Quant node. If a Clip node is found between the QuantizeLinear and DequantizeLinear, it will be taken into account for the Quant bitwidth calculation.

Input: A model potentially quantized with QuantizeLinear, (optional) Clip and DequantizeLinear nodes.

Output: A model with QuantizeLinear, Clip and DequantizeLinear nodes re-fused back into QONNX Quant nodes.

apply(model: ModelWrapper) → Tuple[ModelWrapper, bool]
qonnx.transformation.qcdq_to_qonnx.extract_elem_type(elem_type: int, clip_range=None) → Tuple[int, int, bool]

Return Quant attribute specification based on element type and (optional) clipping range. Returns: (bitwidth, signed, is_narrow_qnt)

qonnx.transformation.qonnx_to_qcdq

class qonnx.transformation.qonnx_to_qcdq.QuantToQCDQ

Bases: Transformation

Replace QONNX Quant-style quantization nodes with QuantizeLinear -> Clip -> DequantizeLinear (QCDQ)-style quantization nodes. The following restrictions apply to the Quant nodes:

  • the scale, zero-point and bitwidth inputs for Quant must be statically specified by an initializer

  • the bitwidth must be an integer in the range [2, 8]

  • the zero-point tensor must be zero

  • the scale must be a scalar value or 1D tensor

  • the rounding_mode attribute must be ROUND

BipolarQuant is not (yet) supported.

apply(model: ModelWrapper)

qonnx.transformation.quant_constant_folding

class qonnx.transformation.quant_constant_folding.FoldTransposeIntoQuantInit

Bases: Transformation

Fuses a Transpose node into the initializers of a Quant node.

apply(model: ModelWrapper)
qonnx.transformation.quant_constant_folding.is_quant_init(node: NodeProto, model: ModelWrapper)

qonnx.transformation.quantize_graph

class qonnx.transformation.quantize_graph.QuantizeGraph(quantnode_map)

Bases: Transformation

This transformation can be used to introduce a Quant node for a specific type of node in the graph. Users can specify the location of the Quant node by providing the input or output index of the target node as parameters.

  1. Expectations:
    1. ONNX model in the ModelWrapper format.

    2. Model must be cleaned using qonnx.util.cleanup.cleanup_model().

    3. Batch size must be set.

  2. Steps to transform:

    Step 1: Find the input for the Quant node.
    Step 2: Find the consumer of the Quant node output.
    Step 3: Find the shape for the output tensor of the Quant node.
    Note: The output tensor of the Quant node must have the same shape as the input to the Quant node.

  3. Input:

    A dict "quantnode_map" specifying the criterion, positions, and input parameters (scale, zeropoint, bitwidth, and others) for a specific Quant node.

    Criterion:
    1. name: allows users to add Quant nodes for specific nodes like "Conv_0" or "Gemm_0". Note: with this criterion, users can create Quant nodes with different parameters, e.g. quantizing "Conv_0" and "Conv_1" with bitwidths of 4 and 6, respectively.

    2. op_type: allows users to add Quant nodes for all nodes of a particular op_type, such as "Conv" or "Gemm". Note: all Quant nodes created using the op_type criterion will have the same input parameters (scale, zeropoint, bitwidth, and others).

    3. name and op_type: in this case, Quant nodes will be added with precedence given to "name" over "op_type".

    Positions: ("input", index) or ("output", index)
    1. "input": indicates that the user wants to quantize the input of the selected node.

    2. "output": indicates that the user wants to quantize the output of the selected node.

    3. index: refers to the input/output index to quantize (a node can have multiple inputs and outputs).

    Parameters (to the Quant node) are provided as (scale, zeropoint, bitwidth, narrow, signed, rounding_mode):

    1. Inputs: scale, zeropoint, bitwidth.

    2. Attributes: narrow, signed, rounding_mode.

  4. Assert:
    1. The input is a dictionary with node names as keys and lists of quant positions as values.

    2. The input dictionary must contain at least one MAC node (Conv, Gemm, MatMul) for the transformation.

  5. Return:

    Returns a model with new Quant nodes created at the positions specified by the "quantnode_map".

  6. Example:

    quantnode_map = {
        "name": {
            "Conv_0": [(("input", 0), (1, 0, 8, 0, 1, "ROUND")),
                       (("input", 1), (1, 0, 8, 0, 1, "ROUND")),
                       (("output", 0), (1, 0, 8, 0, 1, "ROUND"))],
            "Conv_1": [(("input", 0), (1, 0, 8, 0, 1, "ROUND"))],
            "Conv_2": [(("input", 1), (1, 0, 8, 0, 1, "ROUND")),
                       (("output", 0), (1, 0, 8, 0, 1, "ROUND"))],
        },
        "op_type": {
            "Gemm": [(("input", 0), (1, 0, 8, 0, 1, "ROUND")),
                     (("input", 1), (1, 0, 8, 0, 1, "ROUND")),
                     (("input", 2), (1, 0, 8, 0, 1, "ROUND")),
                     (("output", 0), (1, 0, 8, 0, 1, "ROUND"))],
        },
    }
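
With a map like the one above, the transformation is applied as usual (a minimal sketch, assuming a ModelWrapper instance named model):

from qonnx.transformation.quantize_graph import QuantizeGraph

model = model.transform(QuantizeGraph(quantnode_map))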

apply(model)
qonnx.transformation.quantize_graph.adjust_graph(model, input_positions, node_name, quantized_nodes)
qonnx.transformation.quantize_graph.create_quantnode(model, quantnode_input, quantnode_output_shape, scale_value, zeropoint_value, bitwidth_value, narrow, signed, rounding_mode)

qonnx.transformation.rebalance_conv

class qonnx.transformation.rebalance_conv.RebalanceIm2Col(extract_channels)

Bases: Transformation

For certain hardware that prefers channel parallelism over feature-map spatial parallelism, it is possible to reshape the inputs to an Im2Col node to move some of the spatial dimension into the channels dimension. This transformation attempts to find such Im2Col nodes, adds a Reshape node in front of them and alters their kernel/stride sizes accordingly. See the implementation for the full list of conditions checked; one example of rebalancing is provided in the unit test for this transformation (test_rebalance_conv.py).

apply(model)

qonnx.transformation.remove

class qonnx.transformation.remove.RemoveIdentityOps(atol=1e-05)

Bases: Transformation

Remove identity ops like Add/Sub with zero or Mul/Div with one. A tolerance value (defaults to 1e-05) can be specified during init for the comparison to zero/one.

apply(model)
class qonnx.transformation.remove.RemoveUnusedNodes

Bases: Transformation

Remove nodes which do not contribute to any top-level output in the graph, either directly or indirectly.

apply(model: ModelWrapper)
qonnx.transformation.remove.remove_node_and_rewire(model, node)

qonnx.transformation.resize_conv_to_deconv

class qonnx.transformation.resize_conv_to_deconv.ResizeConvolutionToDeconvolution(maintain_bit_width: bool = False)

Bases: Transformation

Replaces resize convolution layers (e.g., nearest-neighbor upsample + same-padded convolution) with deconvolution layers using the weight convolution algorithm. Currently does not support resize convolutions that use bilinear or bicubic upsampling.

apply(model)

qonnx.transformation.subpixel_to_deconv

class qonnx.transformation.subpixel_to_deconv.SubPixelToDeconvolution

Bases: Transformation

Replaces sub-pixel convolution layers (i.e., same-padded convolution + depth2space) with deconvolution layers using the weight shuffle algorithm. Currently does not support same-padded convolutions with biases.

apply(model)

finn.transformation.move_reshape

class finn.transformation.move_reshape.RemoveCNVtoFCFlatten

Bases: Transformation

Removes a flatten node if it is between two fpgadataflow nodes. For an NHWC-Conv to FC transition, the preceding transpose is absorbed. The flatten operation can also be implemented by a reshape node.

apply(model)