Transformation - fpgadataflow

Transformations (fpgadataflow)

finn.transformation.fpgadataflow.annotate_cycles

class finn.transformation.fpgadataflow.annotate_cycles.AnnotateCycles

Bases: Transformation

Annotate the estimate of clock cycles per sample taken by each fpgadataflow node as an attribute on the node.

apply(model)

finn.transformation.fpgadataflow.annotate_resources

class finn.transformation.fpgadataflow.annotate_resources.AnnotateResources(mode, override_res_dict=None)

Bases: Transformation

Annotate the amount of FPGA resources taken by each fpgadataflow node as an attribute on the node, depending on the mode parameter:

  • ‘estimate’ – use the analytical estimation model

  • ‘hls’ – use results from the HLS synthesis report

  • ‘synth’ – use the post-synthesis (Vivado or Vitis) report

No annotations can be provided unless the relevant transformation for the chosen mode (e.g. HLSSynthIP for hls) was previously run.
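
A minimal usage sketch (the import paths and model filename are assumptions, not part of this API reference):

    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.annotate_resources import AnnotateResources

    # "dataflow_model.onnx" is a placeholder for an existing FINN dataflow model
    model = ModelWrapper("dataflow_model.onnx")
    model = model.transform(AnnotateResources(mode="estimate"))

    # each fpgadataflow node now carries its resource annotation as a node attribute
    for node in model.graph.node:
        print(node.name, [attr.name for attr in node.attribute])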

apply(model)

finn.transformation.fpgadataflow.cleanup

class finn.transformation.fpgadataflow.cleanup.CleanUp

Bases: Transformation

Remove any generated files for fpgadataflow nodes.

apply(model)

finn.transformation.fpgadataflow.compile_cppsim

class finn.transformation.fpgadataflow.compile_cppsim.CompileCppSim(num_workers=None)

Bases: NodeLocalTransformation

For every node: compile C++ code in node attribute “code_gen_dir_cppsim” and save path to executables in node attribute “executable_path”. All nodes in the graph must have the fpgadataflow backend attribute.

To use these executables, exec_mode must be set to “cppsim” (using transformation SetExecMode) and the model has to be executed using execute_onnx() from finn.core.onnx_exec

  • num_workers (int or None) number of parallel workers, see documentation in NodeLocalTransformation for more details.
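
A hedged sketch of the cppsim flow described above, assuming a prepared FINN dataflow model and recent FINN/QONNX import paths (the filename and input data are placeholders):

    import numpy as np
    import finn.core.onnx_exec as oxe
    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.prepare_cppsim import PrepareCppSim
    from finn.transformation.fpgadataflow.compile_cppsim import CompileCppSim
    from finn.transformation.fpgadataflow.set_exec_mode import SetExecMode

    model = ModelWrapper("dataflow_model.onnx")  # placeholder filename
    model = model.transform(PrepareCppSim())     # generate C++ code per node
    model = model.transform(CompileCppSim(num_workers=4))
    model = model.transform(SetExecMode("cppsim"))

    # execute the model node-by-node using the compiled executables
    iname = model.graph.input[0].name
    idict = {iname: np.zeros(model.get_tensor_shape(iname), dtype=np.float32)}
    odict = oxe.execute_onnx(model, idict)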

applyNodeLocal(node)

finn.transformation.fpgadataflow.convert_to_hw_layers

class finn.transformation.fpgadataflow.convert_to_hw_layers.InferAddStreamsLayer

Bases: Transformation

Convert any Add into an AddStreams HW layer.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferBinaryMatrixVectorActivation

Bases: Transformation

Convert XnorPopcountMatMul layers to MatrixVectorActivation layers. Any immediately following MultiThreshold layers will also be absorbed into the MVTU.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferChannelwiseLinearLayer

Bases: Transformation

Convert any channel-wise Add/Mul into a HW layer.

apply(model)
get_smallest_possible(vals)

Returns the smallest possible (fewest bits) DataType that can represent the given values. Prefers unsigned integers where possible.

class finn.transformation.fpgadataflow.convert_to_hw_layers.InferConcatLayer

Bases: Transformation

Convert suitable Concat nodes (operating on last/-1 axis) into StreamingConcat HW layers.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferConvInpGen

Bases: Transformation

Convert Im2Col layers to ConvolutionInputGenerator layers.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferDuplicateStreamsLayer

Bases: Transformation

Insert a DuplicateStreams HW layer for any tensor with fanout == 2.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferGlobalAccPoolLayer

Bases: Transformation

Convert any GlobalAveragePool into a GlobalAccPool HW layer and a scalar Mul.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferLabelSelectLayer

Bases: Transformation

Convert any TopK into a LabelSelect HW layer.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferLookupLayer

Bases: Transformation

Convert Gather nodes with constant op0 into Lookup HW layers.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferPool

Bases: Transformation

If kernel_shape > strides, replace the Pool layer with a combination of Im2Col + Pool (with kernel_shape == strides), plus Transpose layers to keep the original data layout.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferQuantizedMatrixVectorActivation

Bases: Transformation

Convert MatMul layers with quantized inputs and weights to MatrixVectorActivation layers.
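
A usage sketch for converting a streamlined, quantized model to HW layers (the surrounding flow and filename are assumptions):

    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.convert_to_hw_layers import (
        InferQuantizedMatrixVectorActivation,
        InferThresholdingLayer,
    )

    model = ModelWrapper("streamlined_model.onnx")  # placeholder filename
    model = model.transform(InferQuantizedMatrixVectorActivation())
    model = model.transform(InferThresholdingLayer())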

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferStreamingEltwise

Bases: Transformation

Convert eltwise Sub or Sub -> Abs to StreamingEltwise layer with SubEltwise or AbsDiffEltwise op.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferStreamingMaxPool

Bases: Transformation

Convert MaxPoolNHWC layers to StreamingMaxPool HW layers.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferThresholdingLayer

Bases: Transformation

Convert any MultiThreshold into a standalone thresholding HW layer.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferUpsample

Bases: Transformation

Convert Upsample and Resize nodes to UpsampleNearestNeighbour HW layers.

apply(model)
class finn.transformation.fpgadataflow.convert_to_hw_layers.InferVectorVectorActivation

Bases: Transformation

Convert MatMul layers with quantized inputs and weights to VectorVectorActivation layers, if the sparsity annotation of the weight matrix indicates that the MatMul layer belongs to a depthwise convolution. Any immediately following MultiThreshold layers will also be absorbed into the VVAU.

apply(model)

finn.transformation.fpgadataflow.create_dataflow_partition

class finn.transformation.fpgadataflow.create_dataflow_partition.CreateDataflowPartition(partition_model_dir=None)

Bases: Transformation

Split a graph into two graphs; one which contains non-FINN-dataflow nodes and a StreamingDataflowPartition node, and another which only contains FINN dataflow nodes. The StreamingDataflowPartition has a model attribute that indicates the filename for the second graph that only contains dataflow nodes. No action is taken if there are no dataflow nodes.
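
A sketch of retrieving the dataflow-only child model after partitioning (the getCustomOp import and filename are assumptions based on common QONNX conventions):

    from qonnx.core.modelwrapper import ModelWrapper
    from qonnx.custom_op.registry import getCustomOp
    from finn.transformation.fpgadataflow.create_dataflow_partition import (
        CreateDataflowPartition,
    )

    parent = ModelWrapper("parent_model.onnx")  # placeholder filename
    parent = parent.transform(CreateDataflowPartition())

    # the StreamingDataflowPartition node points to the dataflow-only graph
    sdp_node = parent.get_nodes_by_op_type("StreamingDataflowPartition")[0]
    child = ModelWrapper(getCustomOp(sdp_node).get_nodeattr("model"))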

apply(model)

finn.transformation.fpgadataflow.create_stitched_ip

class finn.transformation.fpgadataflow.create_stitched_ip.CreateStitchedIP(fpgapart, clk_ns, ip_name='finn_design', vitis=False, signature=[])

Bases: Transformation

Create a Vivado IP Block Design project from all the generated IPs of a graph. All nodes in the graph must have the fpgadataflow backend attribute, and the PrepareIP transformation must have been previously run on the graph. The resulting block design is also packaged as IP. The transformation gets the fpgapart as a string.

Outcome if successful: sets the vivado_stitch_proj attribute in the ONNX ModelProto’s metadata_props field, with the created project dir as the value. A make_project.tcl script is also placed under the same folder, which is called to instantiate the per-layer IPs and stitch them together. The packaged block design IP can be found under the ip subdirectory.
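
A usage sketch (the part string, clock period and filename are placeholders; PrepareIP and HLSSynthIP are assumed to have been run already):

    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.create_stitched_ip import CreateStitchedIP

    model = ModelWrapper("ipgen_model.onnx")  # placeholder filename
    model = model.transform(CreateStitchedIP(fpgapart="xc7z020clg400-1", clk_ns=10.0))
    print("stitched IP project:", model.get_metadata_prop("vivado_stitch_proj"))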

apply(model)
connect_ap_none_external(node)
connect_axi(node)
connect_clk_rst(node)
connect_m_axis_external(node, idx=None)
connect_s_axis_external(node, idx=None)
insert_signature(checksum_count)
finn.transformation.fpgadataflow.create_stitched_ip.is_external_input(model, node, i)
finn.transformation.fpgadataflow.create_stitched_ip.is_external_output(model, node, i)

finn.transformation.fpgadataflow.derive_characteristic

class finn.transformation.fpgadataflow.derive_characteristic.DeriveCharacteristic(period, num_workers=None, manual_bypass=False)

Bases: NodeLocalTransformation

For each node in the graph, run rtlsim to obtain the i/o characteristic function for FIFO sizing and set the attribute. It is assumed that the PrepareRTLSim transformation was already called on the graph.

This transformation performs rtlsim for each node, so it will run for some time (minutes to hours depending on configuration).

  • period (int) desired period over which the characteristic function will be derived.

  • num_workers (int or None) number of parallel workers, see documentation in NodeLocalTransformation for more details.

apply(model: ModelWrapper)
applyNodeLocal(node)
class finn.transformation.fpgadataflow.derive_characteristic.DeriveFIFOSizes(num_workers=None, io_fifo_depth=32)

Bases: NodeLocalTransformation

Prerequisite: DeriveCharacteristic already called on graph. For each node in the graph, use the accumulated I/O characteristic function to perform FIFO sizing, setting the in/outFIFODepths attributes of HLSCustomOp nodes.

  • num_workers (int or None) number of parallel workers, see documentation in NodeLocalTransformation for more details.
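
A sketch of the characteristic-based FIFO sizing flow (the period value and filename are placeholders; PrepareRTLSim is assumed to have been run already):

    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.derive_characteristic import (
        DeriveCharacteristic,
        DeriveFIFOSizes,
    )

    model = ModelWrapper("rtlsim_ready_model.onnx")  # placeholder filename
    model = model.transform(DeriveCharacteristic(period=10000))
    model = model.transform(DeriveFIFOSizes())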

applyNodeLocal(node)

finn.transformation.fpgadataflow.externalize_params

class finn.transformation.fpgadataflow.externalize_params.ExternalizeParams

Bases: Transformation

Create top-level graph inputs for IODMAs serving layers where weights are marked as external using mem_mode=”external”.

apply(model)

finn.transformation.fpgadataflow.floorplan

class finn.transformation.fpgadataflow.floorplan.Floorplan(floorplan=None)

Bases: Transformation

Perform floorplanning of the dataflow design.

floorplan: path to a JSON containing a dictionary with SLR assignments for each node in the ONNX graph. Must be parse-able by the ApplyConfig transform.

The transform applies the properties in the supplied JSON, then:

  • separates DMAs into their own partition IDs,

  • if not explicitly assigned, assigns DWCs to SLRs to minimize the SLLs required,

  • if not explicitly assigned, assigns FIFOs to the SLR of the upstream node.

apply(model)

finn.transformation.fpgadataflow.hlssynth_ip

class finn.transformation.fpgadataflow.hlssynth_ip.HLSSynthIP(num_workers=None)

Bases: NodeLocalTransformation

For each HLS node: generate an IP block from the code in the folder referenced by node attribute “code_gen_dir_ipgen” and save the path of the generated project in node attribute “ipgen_path”. All nodes in the graph must have the fpgadataflow backend attribute. Any nodes that already have an ipgen_path attribute pointing to a valid path will be skipped.

This transformation calls Vitis HLS for synthesis, so it will run for some time (minutes to hours depending on configuration).

  • num_workers (int or None) number of parallel workers, see documentation in NodeLocalTransformation for more details.

applyNodeLocal(node)

finn.transformation.fpgadataflow.infer_pixel_padding_deconv

finn.transformation.fpgadataflow.insert_dwc

class finn.transformation.fpgadataflow.insert_dwc.InsertDWC

Bases: Transformation

Add data width converters between layers where necessary.

apply(model)

finn.transformation.fpgadataflow.insert_fifo

class finn.transformation.fpgadataflow.insert_fifo.InsertFIFO(create_shallow_fifos=False, max_qsrl_depth=None, vivado_ram_style='auto')

Bases: Transformation

Insert FIFOs at the beginning and end of the graph, as well as between fpgadataflow nodes.

The depth setting is taken from the surrounding nodes by extracting node attribute ‘outFIFODepths’ of the preceding node and node attribute ‘inFIFODepths’ of the subsequent node; the max() of these two values sets the FIFO depth.

Constructor arguments:

Parameters:
  • max_qsrl_depth – FIFOs deeper than this will use Vivado IP instead of Verilog FIFOs (Q_srl.v)

  • vivado_ram_style – the StreamingFIFO.ram_style attribute to be used for large FIFOs implemented by Vivado

  • create_shallow_fifos – Normally, shallow-depth (<=2) FIFOs won’t be created since HLS streaming interfaces already have a degree of buffering. Override with this parameter.

The other node attributes necessary to create a FIFO node are taken from the node the FIFO node is inserted after: ‘folded_shape’ and ‘dtype’.
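
An illustrative sketch of the depth rule described above, in plain Python (not the actual FINN implementation):

    # hypothetical per-stream settings on the producer and consumer nodes
    producer_out_fifo_depths = [2, 64]   # producer's 'outFIFODepths'
    consumer_in_fifo_depths = [32]       # consumer's 'inFIFODepths'

    # the FIFO inserted between producer output 1 and consumer input 0 gets:
    fifo_depth = max(producer_out_fifo_depths[1], consumer_in_fifo_depths[0])
    print(fifo_depth)  # 64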

apply(model)

finn.transformation.fpgadataflow.insert_hook

class finn.transformation.fpgadataflow.insert_hook.InsertHook

Bases: Transformation

Insert a hook layer after each layer that has the node attribute ‘output_hook’ specified.

apply(model)

finn.transformation.fpgadataflow.insert_iodma

class finn.transformation.fpgadataflow.insert_iodma.InsertIODMA(max_intfwidth=32, insert_input=True, insert_output=True, insert_extmemw=True)

Bases: Transformation

Insert DMA nodes on inputs and outputs, or as specified by filters in the constructor.

apply(model)
get_mem_init(weights, pe, simd)

Returns a matrix ready for pack_innermost_dim_as_hex_string with reverse=False (finn.util.data_packing), so that the memory init file is packed little-endian. That is, with elem(pe, simd) denoting a weight element, get_mem_init returns:

addr = 0: [(pe-1, simd-1), (pe-1, simd-2), … (0, 1), (0, 0)]

addr = 1: [(pe-1, simd*2-1), … (0, simd+1), (0, simd)]

…

finn.transformation.fpgadataflow.insert_tlastmarker

class finn.transformation.fpgadataflow.insert_tlastmarker.InsertTLastMarker(both=False, external=True, dynamic=True)

Bases: Transformation

Ensure that the graph is started/terminated with a TLastMarker_hls node, inserting one if necessary. Use the constructor args to determine the type of TLastMarker to be inserted. More information is available in the TLastMarker documentation.

apply(model)

finn.transformation.fpgadataflow.make_pynq_driver

class finn.transformation.fpgadataflow.make_pynq_driver.MakePYNQDriver(platform)

Bases: Transformation

Create PYNQ Python code to correctly interface the generated accelerator, including data packing/unpacking. Should be called after conversion to HLS layers, folding and the creation of dataflow partitions for correct operation.

platform: one of [“zynq-iodma”, “alveo”]

Outcome if successful: sets the pynq_driver_dir attribute in the ONNX ModelProto’s metadata_props field, with the created driver dir as the value. If any layers use runtime-writable parameters, those will be gathered under the runtime_weights/ subfolder of the pynq_driver_dir.
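
A usage sketch (the platform string follows the list above; the filename is a placeholder):

    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.make_pynq_driver import MakePYNQDriver

    model = ModelWrapper("post_synth_model.onnx")  # placeholder filename
    model = model.transform(MakePYNQDriver(platform="zynq-iodma"))
    print("driver dir:", model.get_metadata_prop("pynq_driver_dir"))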

apply(model)
finn.transformation.fpgadataflow.make_pynq_driver.to_external_tensor(init, w_dtype)

Return an appropriately formatted and packed numpy byte array for given external parameter tensor.

finn.transformation.fpgadataflow.make_zynq_proj

class finn.transformation.fpgadataflow.make_zynq_proj.MakeZYNQProject(platform, enable_debug=False)

Bases: Transformation

Create a Vivado overlay project (including the shell infrastructure) from the already-stitched IP block for this graph. All nodes in the graph must have the fpgadataflow backend attribute, and the CreateStitchedIP transformation must have been previously run on the graph. This is functionally equivalent to MakePYNQProject but does not use PYNQ infrastructure and instead creates a fully custom block design. However, this transform requires DMAs in the accelerator design.

Outcome if successful: sets the vivado_pynq_proj attribute in the ONNX ModelProto’s metadata_props field, with the created project dir as the value.

apply(model)
class finn.transformation.fpgadataflow.make_zynq_proj.ZynqBuild(platform, period_ns, enable_debug=False, partition_model_dir=None)

Bases: Transformation

Best-effort attempt at building the accelerator for Zynq. It assumes the model has only fpgadataflow nodes.
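
A usage sketch (the board name, clock period and filename are placeholder values):

    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.make_zynq_proj import ZynqBuild

    model = ModelWrapper("dataflow_model.onnx")  # placeholder filename
    model = model.transform(ZynqBuild(platform="Pynq-Z1", period_ns=10.0))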

apply(model)
finn.transformation.fpgadataflow.make_zynq_proj.collect_ip_dirs(model, ipstitch_path)

finn.transformation.fpgadataflow.minimize_accumulator_width

class finn.transformation.fpgadataflow.minimize_accumulator_width.MinimizeAccumulatorWidth

Bases: Transformation

For relevant nodes, call the accumulator width minimization functions to save on resources. May alter tensor DataType for certain nodes if they produce an accumulator as result.

apply(model)

finn.transformation.fpgadataflow.minimize_weight_bit_width

class finn.transformation.fpgadataflow.minimize_weight_bit_width.MinimizeWeightBitWidth

Bases: Transformation

For relevant nodes, call the weight bit width minimization functions to save on resources. May alter tensor weightDataType if the node does not have runtime writeable weights.

apply(model)

finn.transformation.fpgadataflow.prepare_cppsim

class finn.transformation.fpgadataflow.prepare_cppsim.PrepareCppSim(num_workers=None)

Bases: Transformation

Call the custom implementation to generate code for a single custom node and create a folder that contains all the generated files. All nodes in the graph must have the fpgadataflow backend attribute.

Outcome if successful: the node attribute “code_gen_dir_cppsim” contains the path to a folder with the generated C++ code, which can be used to simulate the node using cppsim. The subsequent transformation is CompileCppSim.

apply(model)
prepareCppSim_node(node)

finn.transformation.fpgadataflow.prepare_ip

class finn.transformation.fpgadataflow.prepare_ip.PrepareIP(fpgapart, clk)

Bases: Transformation

Call the custom implementation to generate code for a single custom node and create a folder that contains all the generated files. All nodes in the graph must have the fpgadataflow backend attribute, and the transformation takes additional arguments:

  • fpgapart (string)

  • clk in ns (int)

Any nodes that already have a code_gen_dir_ipgen attribute pointing to a valid path will be skipped.

Outcome if successful: the node attribute “code_gen_dir_ipgen” contains the path to a folder that contains:

  • For HLS layers: generated C++ code that can be used to generate a Vivado IP block. The necessary subsequent transformation is HLSSynthIP.

  • For RTL layers: filled Verilog template files that can be instantiated as modules during IP stitching.

apply(model)

finn.transformation.fpgadataflow.prepare_rtlsim

class finn.transformation.fpgadataflow.prepare_rtlsim.PrepareRTLSim(num_workers=None)

Bases: NodeLocalTransformation

For a graph with generated RTL sources (after HLSSynthIP), create a Verilator emulation library for each node to prepare for rtlsim execution and set the rtlsim_so property to the path to the generated emulation library.

To use these libraries, exec_mode must be set to “rtlsim” (using SetExecMode) and the model has to be executed using execute_onnx() from finn.core.onnx_exec

  • num_workers (int or None) number of parallel workers, see documentation in NodeLocalTransformation for more details.
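
A hedged sketch of the rtlsim flow described above (HLSSynthIP is assumed to have been run already; the filename and input data are placeholders):

    import numpy as np
    import finn.core.onnx_exec as oxe
    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.prepare_rtlsim import PrepareRTLSim
    from finn.transformation.fpgadataflow.set_exec_mode import SetExecMode

    model = ModelWrapper("hlssynth_model.onnx")  # placeholder filename
    model = model.transform(PrepareRTLSim())
    model = model.transform(SetExecMode("rtlsim"))

    iname = model.graph.input[0].name
    idict = {iname: np.zeros(model.get_tensor_shape(iname), dtype=np.float32)}
    odict = oxe.execute_onnx(model, idict)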

apply(model)
applyNodeLocal(node)

finn.transformation.fpgadataflow.replace_verilog_relpaths

class finn.transformation.fpgadataflow.replace_verilog_relpaths.ReplaceVerilogRelPaths

Bases: Transformation

Convert ./ relative file paths to absolute ones for generated Verilog.

apply(model)

finn.transformation.fpgadataflow.set_exec_mode

class finn.transformation.fpgadataflow.set_exec_mode.SetExecMode(mode)

Bases: Transformation

Set the attribute exec_mode in all fpgadataflow nodes to specify which kind of execution should be used (“cppsim” or “rtlsim”). Note that RTL components do not support cppsim; when cppsim is selected for RTL components, execution falls back to that of the parent HW op by default.

apply(model)

finn.transformation.fpgadataflow.set_fifo_depths

class finn.transformation.fpgadataflow.set_fifo_depths.CapConvolutionFIFODepths(max_qsrl_depth=256)

Bases: Transformation

Make the size of FIFOs for convolution layers smaller where possible. Will be automatically called from InsertAndSetFIFODepths if the appropriate constructor flag is set.

Constructor arguments:

Parameters:

max_qsrl_depth – FIFOs deeper than this will use Vivado IP instead of Verilog FIFOs (Q_srl.v)

Assumed input graph properties:

  • all nodes are fpgadataflow nodes

  • FIFOs inserted with InsertAndSetFIFODepths

Output:

  • graph with smaller-depth FIFOs for convolutions

Background: The simulation-based rtlsim_exec tends to overestimate the required depth of FIFOs between the ConvolutionInputGenerator (here called SWG) and the MatrixVectorActivation (here called MVAU). As the SWG has an internal buffer of 1 image row, we use this as a rule of thumb to set FIFO depth to be no larger than 1 row.

apply(model)
class finn.transformation.fpgadataflow.set_fifo_depths.InsertAndSetFIFODepths(fpgapart, clk_ns=10.0, max_qsrl_depth=256, max_depth=None, swg_exception=False, vivado_ram_style='auto', force_python_sim=False)

Bases: Transformation

Insert StreamingFIFOs of appropriate depth, determined through rtlsim, that preserve throughput in the created accelerator.

Constructor arguments:

Parameters:
  • clk_ns – clock period (used for IP preparation)

  • max_qsrl_depth – FIFOs deeper than this will use Vivado IP instead of Verilog FIFOs (Q_srl.v)

  • max_depth – how deep the “max”-sized FIFOs initially inserted will be. If set to None, use the tensor size as the depth

  • swg_exception – call CapConvolutionFIFODepths to make convolution FIFOs smaller where appropriate

  • vivado_ram_style – the StreamingFIFO.ram_style attribute to be used for large FIFOs implemented by Vivado afterwards

Assumed input graph properties:

  • all nodes are fpgadataflow nodes

  • no FIFOs inserted (inFIFODepths/outFIFODepths attrs will be ignored)

Output:

  • graph with appropriate-depth FIFOs inserted

Background: Even with all FINN HLS fpgadataflow layers appropriately parallelized, it is necessary to insert FIFOs between them to prevent stalls due to bursty behavior. The sizes of those FIFOs are hard to predict analytically, so we do the following (a usage sketch follows the list below):

  • insert deep (=tensor size) FIFOs between all fpgadataflow nodes

  • create stitched design

  • run through rtlsim with stream of multiple random input images (to fill pipeline)

  • keep track of observed maximum occupancy for each FIFO during rtlsim

  • when sim finished, update each FIFO depth to maximum observed occupancy and set inFIFODepths/outFIFODepths attrs to that depth as well
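
A usage sketch (the part string and filename are placeholders; as noted above, this runs rtlsim and may take a long time):

    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.set_fifo_depths import InsertAndSetFIFODepths

    model = ModelWrapper("folded_model.onnx")  # placeholder filename
    model = model.transform(InsertAndSetFIFODepths(fpgapart="xc7z020clg400-1",
                                                   clk_ns=10.0))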

apply(model)
class finn.transformation.fpgadataflow.set_fifo_depths.RemoveShallowFIFOs(shallow_threshold=0)

Bases: Transformation

Remove zero-depth FIFOs. The threshold used to be 2 instead of 0, but with the increasing number of FINN RTL components, 2-depth FIFOs are still important for decoupling.

apply(model)
class finn.transformation.fpgadataflow.set_fifo_depths.SplitLargeFIFOs(max_qsrl_depth=256, max_vivado_depth=32768)

Bases: Transformation

Split large FIFOs before implementation, for two reasons:

  • impl_style=”vivado” supports a max depth of 32k. Any larger FIFOs must be implemented as a sequence of smaller FIFOs.

  • impl_style=”vivado” requires power-of-two depths, which is normally handled by rounding up to the nearest power-of-two. So a FIFO of size 8196 normally gets rounded-up to a depth of 16384 and wastes a lot of resources. Here, instead, we split this up into two FIFOs of depth 8192 + 4.
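
As an illustration of the second bullet above, the helper documented below can be called directly to see how a given depth would be split (its exact return format is not specified here, so it is simply printed):

    from finn.transformation.fpgadataflow.set_fifo_depths import get_fifo_split_configs

    # e.g. a depth of 8196 is split rather than rounded up to 16384
    print(get_fifo_split_configs(8196))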

apply(model)
finn.transformation.fpgadataflow.set_fifo_depths.get_fifo_split_configs(depth, max_qsrl_depth=256, max_vivado_depth=32768)

Break non-power-of-2 FIFO depths into several smaller ones.

finn.transformation.fpgadataflow.set_fifo_depths.get_signal(sim, keyw)
finn.transformation.fpgadataflow.set_fifo_depths.optimize_depth(depth)
finn.transformation.fpgadataflow.set_fifo_depths.reset_implementation(node)
finn.transformation.fpgadataflow.set_fifo_depths.set_signal(sim, keyw, value)

finn.transformation.fpgadataflow.set_folding

class finn.transformation.fpgadataflow.set_folding.SetFolding(target_cycles_per_frame=1000, mvau_wwidth_max=36, two_pass_relaxation=True)

Bases: Transformation

Attempt to set parallelism attributes in all nodes to meet a specific target expressed as cycles per frame target_cycles_per_frame. For each HLSCustomOp node type, the attribute may vary but is typically one of {PE, SIMD}, and has a certain allowed-maximum value and divisibility constraints, which SetFolding will take into account. Note that the algorithm implemented by SetFolding is very simple and it is often possible to hand-tune the returned parallelism configuration for better results.

In the returned model, each node’s cycles_estimate attribute will be set to its estimated number of cycles.

If two_pass_relaxation is enabled, SetFolding will internally run a second time if the target cycles from the first pass could not be achieved, instead using the achievable target (which may be constrained by a single node) to obtain a balanced pipeline.
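
A usage sketch: fold for a target of 4000 cycles per frame, then read back the per-node cycles_estimate annotation (the getCustomOp import is assumed from QONNX, and all nodes are assumed to be fpgadataflow custom ops):

    from qonnx.core.modelwrapper import ModelWrapper
    from qonnx.custom_op.registry import getCustomOp
    from finn.transformation.fpgadataflow.set_folding import SetFolding

    model = ModelWrapper("hw_layers_model.onnx")  # placeholder filename
    model = model.transform(SetFolding(target_cycles_per_frame=4000))
    for node in model.graph.node:
        print(node.name, getCustomOp(node).get_nodeattr("cycles_estimate"))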

Notable exceptions and special behavior:

When folding dense convolution/FC compute engines (“MVAU”/MatrixVectorActivation), which have two attributes (PE and SIMD):

  • first increases SIMD while weight stream width per PE is <= mvau_wwidth_max (configurable in the SetFolding initializer, defaults to 36)

  • then increases PE until the target is met or max PE reached

When folding depthwise convolutions (“VVAU”/VectorVectorActivation) or spatial reduction ops (Pool_Batch):

  • the producer of the node is expected to be a ConvolutionInputGenerator with depthwise=1, whose SIMD value will be set equal to the PE value of its consumer node

  • the VVAU also supports SIMD (“input window”) parallelism next to PE (“channels”), but current ConvInpGen limitations require PE to be fully unfolded before SIMD is increased

apply(model)
optimize_attribute_val(node_inst, max_val, attr_name)
finn.transformation.fpgadataflow.set_folding.divisors(num)

finn.transformation.fpgadataflow.specialize_layers

class finn.transformation.fpgadataflow.specialize_layers.SpecializeLayers(fpgapart='')

Bases: Transformation

Specialize all layers to either HLS or RTL variants.
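
A usage sketch (the part string and filename are placeholders):

    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.specialize_layers import SpecializeLayers

    model = ModelWrapper("hw_layers_model.onnx")  # placeholder filename
    model = model.transform(SpecializeLayers(fpgapart="xc7z020clg400-1"))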

apply(model)

finn.transformation.fpgadataflow.synth_ooc

class finn.transformation.fpgadataflow.synth_ooc.SynthOutOfContext(part, clk_period_ns, clk_name='ap_clk')

Bases: Transformation

Run out-of-context Vivado synthesis on a stitched IP design.

apply(model)

finn.transformation.fpgadataflow.template_driver

finn.transformation.fpgadataflow.templates

finn.transformation.fpgadataflow.vitis_build

class finn.transformation.fpgadataflow.vitis_build.CreateVitisXO(ip_name='finn_design')

Bases: Transformation

Create a Vitis object file from a stitched FINN IP.

Outcome if successful: sets the vitis_xo attribute in the ONNX ModelProto’s metadata_props field with the name of the object file as value. The object file can be found under the ip subdirectory.

apply(model)
class finn.transformation.fpgadataflow.vitis_build.VitisBuild(fpga_part, period_ns, platform, strategy=VitisOptStrategy.PERFORMANCE, enable_debug=False, floorplan_file=None, enable_link=True, partition_model_dir=None)

Bases: Transformation

Best-effort attempt at building the accelerator with Vitis. It assumes the model has only fpgadataflow nodes. A usage sketch follows the parameter list below.

Parameters:
  • fpga_part – string identifying the target FPGA

  • period_ns – target clock period

  • platform – target Alveo platform, one of [“U50”, “U200”, “U250”, “U280”]

  • strategy – Vitis optimization strategy

  • enable_debug – add Chipscope to all AXI interfaces

  • floorplan_file – path to a JSON containing a dictionary with SLR assignments for each node in the ONNX graph. Must be parse-able by the ApplyConfig transform.

  • enable_link – enable linking kernels (.xo files), otherwise just synthesize them independently.
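
A usage sketch (the part string, clock period and filename are placeholders; the platform value follows the list above):

    from qonnx.core.modelwrapper import ModelWrapper
    from finn.transformation.fpgadataflow.vitis_build import VitisBuild, VitisOptStrategy

    model = ModelWrapper("dataflow_model.onnx")  # placeholder filename
    model = model.transform(VitisBuild(fpga_part="xcu250-figd2104-2L-e",
                                       period_ns=5.0,
                                       platform="U250",
                                       strategy=VitisOptStrategy.PERFORMANCE))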

apply(model)

class finn.transformation.fpgadataflow.vitis_build.VitisLink(platform, f_mhz=200, strategy=VitisOptStrategy.PERFORMANCE, enable_debug=False)

Bases: Transformation

Create an XCLBIN with Vitis.

Outcome if successful: sets the bitfile attribute in the ONNX ModelProto’s metadata_props field with the XCLBIN full path as value.

apply(model)
class finn.transformation.fpgadataflow.vitis_build.VitisOptStrategy(value)

Bases: Enum

Values applicable to VitisBuild optimization strategy.

BUILD_SPEED = 'quick'
DEFAULT = '0'
PERFORMANCE = '2'
PERFORMANCE_BEST = '3'
POWER = '1'
SIZE = 's'