Builder

Modules

finn.builder.build_dataflow

class finn.builder.build_dataflow.StreamToLogger(logger, level)

Bases: object

Fake file-like stream object that redirects writes to a logger instance.

flush()

write(buf)
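As a rough illustration of the idea, the sketch below shows how a file-like object can forward writes to a logger. It is not FINN's source code: the class name StreamToLoggerSketch, the ListHandler helper and the logger name are hypothetical, and the line-splitting behaviour is an assumption.

```python
import logging


class StreamToLoggerSketch:
    """File-like object that forwards each written line to a logger (illustrative)."""

    def __init__(self, logger, level):
        self.logger = logger
        self.level = level

    def write(self, buf):
        # Log every line separately; a bare newline produces no log record.
        for line in buf.rstrip().splitlines():
            self.logger.log(self.level, line.rstrip())

    def flush(self):
        # Nothing is buffered here, so there is nothing to flush.
        pass


class ListHandler(logging.Handler):
    """Hypothetical helper that collects log messages in a list for inspection."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


demo_logger = logging.getLogger("stream_to_logger_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
demo_handler = ListHandler()
demo_logger.addHandler(demo_handler)

# Anything printed to the stream ends up in the logger instead of stdout.
stream = StreamToLoggerSketch(demo_logger, logging.INFO)
print("step 1 done\nstep 2 done", file=stream)
```

An object like this can stand in for sys.stdout or sys.stderr while a build step runs, so that tool output lands in the build log.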

finn.builder.build_dataflow.build_dataflow_cfg(model_filename, cfg: DataflowBuildConfig)

Best-effort build a dataflow accelerator using the given configuration.

Parameters:

model_filename – ONNX model filename to build
cfg – Build configuration
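A call might look like the sketch below, which assumes FINN is installed and importable. The helper name run_estimate_build, the 10 ns clock and the FPGA part are placeholders chosen for illustration; the classes, fields and step list are the ones documented on this page.

```python
def run_estimate_build(model_filename, output_dir):
    """Run an estimate-only build for the given ONNX file (illustrative helper)."""
    # Imported lazily so that defining this helper does not require FINN.
    import finn.builder.build_dataflow as build
    import finn.builder.build_dataflow_config as build_cfg

    cfg = build_cfg.DataflowBuildConfig(
        output_dir=output_dir,
        synth_clk_period_ns=10.0,  # placeholder: 10 ns period, i.e. 100 MHz
        fpga_part="xc7z020clg400-1",  # placeholder part taken from the fpga_part example
        generate_outputs=[build_cfg.DataflowOutputType.ESTIMATE_REPORTS],
        steps=build_cfg.estimate_only_dataflow_steps,
    )
    return build.build_dataflow_cfg(model_filename, cfg)
```

Restricting steps to estimate_only_dataflow_steps keeps the run free of synthesis, which is usually a sensible first check before a full build.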

finn.builder.build_dataflow.build_dataflow_directory(path_to_cfg_dir: str)

Best-effort build a dataflow accelerator from the specified directory.

Parameters:

path_to_cfg_dir – Directory containing the model and build config

The specified directory path_to_cfg_dir must contain the following files:

model.onnx : ONNX model to be converted to dataflow accelerator
dataflow_build_config.json : JSON file with build configuration
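The directory layout above can be prepared and checked with a few lines of standard-library Python. This is only a sketch: the helper missing_build_inputs and the example values are hypothetical, and the JSON keys shown are limited to the three required DataflowBuildConfig fields.

```python
import json
import os
import tempfile

# The two files that the build directory is expected to contain.
REQUIRED_FILES = ("model.onnx", "dataflow_build_config.json")


def missing_build_inputs(path_to_cfg_dir):
    """Return the names of required files that are absent from the directory."""
    return [
        name
        for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(path_to_cfg_dir, name))
    ]


# Minimal example configuration; the values are placeholders.
example_cfg = {
    "output_dir": "output_estimates",
    "synth_clk_period_ns": 10.0,
    "generate_outputs": ["estimate_reports"],
}

with tempfile.TemporaryDirectory() as cfg_dir:
    with open(os.path.join(cfg_dir, "dataflow_build_config.json"), "w") as f:
        json.dump(example_cfg, f, indent=2)
    # The model has not been copied in yet, so it is reported as missing.
    still_missing = missing_build_inputs(cfg_dir)
```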

finn.builder.build_dataflow.main(): Entry point for dataflow builds. Invokes build_dataflow_directory using command-line arguments.

finn.builder.build_dataflow.resolve_build_steps(cfg: DataflowBuildConfig, partial: bool = True)

finn.builder.build_dataflow.resolve_step_filename(step_name: str, cfg: DataflowBuildConfig, step_delta: int = 0)

finn.builder.build_dataflow_config

class finn.builder.build_dataflow_config.AutoFIFOSizingMethod(value)

Bases: str, Enum

Select the type of automatic FIFO sizing strategy.

CHARACTERIZE = 'characterize'

LARGEFIFO_RTLSIM = 'largefifo_rtlsim'

class finn.builder.build_dataflow_config.DataflowBuildConfig(output_dir: str, synth_clk_period_ns: float, generate_outputs: List[DataflowOutputType], specialize_layers_config_file: str | None = None, folding_config_file: str | None = None, target_fps: int | None = None, folding_two_pass_relaxation: bool | None = True, verify_steps: List[VerificationStepType] | None = None, verify_input_npy: str | None = 'input.npy', verify_expected_output_npy: str | None = 'expected_output.npy', verify_save_full_context: bool | None = False, verify_save_rtlsim_waveforms: bool | None = False, stitched_ip_gen_dcp: bool | None = False, signature: List[int] | None = None, mvau_wwidth_max: int | None = 36, standalone_thresholds: bool | None = False, minimize_bit_width: bool | None = True, board: str | None = None, shell_flow_type: ShellFlowType | None = None, fpga_part: str | None = None, auto_fifo_depths: bool | None = True, split_large_fifos: bool | None = False, auto_fifo_strategy: AutoFIFOSizingMethod | None = AutoFIFOSizingMethod.LARGEFIFO_RTLSIM, force_python_rtlsim: bool | None = False, large_fifo_mem_style: LargeFIFOMemStyle | None = LargeFIFOMemStyle.AUTO, hls_clk_period_ns: float | None = None, default_swg_exception: bool | None = False, vitis_platform: str | None = None, vitis_floorplan_file: str | None = None, vitis_opt_strategy: VitisOptStrategyCfg | None = VitisOptStrategyCfg.DEFAULT, save_intermediate_models: bool | None = True, enable_hw_debug: bool | None = False, enable_build_pdb_debug: bool | None = True, verbose: bool | None = False, steps: List[Any] | None = None, start_step: str | None = None, stop_step: str | None = None, max_multithreshold_bit_width: int | None = 8, rtlsim_batch_size: int | None = 1, rtlsim_use_vivado_comps: bool | None = True)

Bases: object

Build configuration to be passed to the build_dataflow function. Can be serialized into or de-serialized from JSON files for persistence. See list of attributes below for more information on the build configuration.

auto_fifo_depths: bool | None = True: Whether FIFO depths will be set automatically. Involves running stitched rtlsim and can take a long time. If set to False, the folding_config_file can be used to specify sizes for each FIFO.

auto_fifo_strategy: AutoFIFOSizingMethod | None = 'largefifo_rtlsim': When auto_fifo_depths = True, select which method will be used for setting the FIFO sizes.

board: str | None = None: Target board, only needed for generating full bitfiles where the FINN design is integrated into a shell, e.g. “Pynq-Z1” or “U250”.

default_swg_exception: bool | None = False: Call CapConvolutionFIFODepths in InsertAndSetFIFODepths transform to make convolution FIFOs smaller where appropriate

enable_build_pdb_debug: bool | None = True: Whether pdb postmortem debugging will be launched when the build fails

enable_hw_debug: bool | None = False: Whether hardware debugging will be enabled (e.g. ILA cores inserted to debug signals in the generated hardware)

folding_config_file: str | None = None: (Optional) Path to configuration JSON file. May include parallelization, FIFO sizes, RAM and implementation style attributes and so on. If the parallelization attributes (PE, SIMD) are part of the config, they will override the automatically generated parallelization attributes inferred from target_fps (if any). Will be applied with qonnx.transformation.general.ApplyConfig

folding_two_pass_relaxation: bool | None = True: (Optional) Use two-pass relaxation for folding, only relevant if target_fps is set. If enabled, parallelization will internally run a second time if the target cycles from the first pass could not be achieved, instead using the achievable target to obtain a balanced pipeline. If disabled, this can be useful for decreasing the latency (even though throughput won’t increase).

force_python_rtlsim: bool | None = False: Avoid using C++ rtlsim for auto FIFO sizing and rtlsim throughput test if set to True, always using Python instead

fpga_part: str | None = None: Target Xilinx FPGA part, e.g. “xc7z020clg400-1”. Only needed when board is not specified.

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

generate_outputs: List[DataflowOutputType]: Which output(s) to generate from the build flow. See documentation of DataflowOutputType for available options.

hls_clk_period_ns: float | None = None: Target clock period (in nanoseconds) for Vitis HLS synthesis. e.g. hls_clk_period_ns=5.0 will target a 200 MHz clock. If not specified it will default to synth_clk_period_ns
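The relation between a clock period in nanoseconds and the resulting frequency, and the fallback to synth_clk_period_ns, can be sketched as follows. Both helper names are hypothetical and exist only for this example.

```python
def clk_period_ns_to_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    # 1 / (period_ns * 1e-9 s) Hz equals 1000 / period_ns MHz.
    return 1000.0 / period_ns


def resolve_hls_clk_period_ns(hls_clk_period_ns, synth_clk_period_ns):
    """Fall back to the synthesis clock period when no HLS period is given."""
    if hls_clk_period_ns is None:
        return synth_clk_period_ns
    return hls_clk_period_ns


example_mhz = clk_period_ns_to_mhz(5.0)  # 5 ns period corresponds to 200 MHz
```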

large_fifo_mem_style: LargeFIFOMemStyle | None = 'auto': Memory resource type for large FIFOs. Only relevant when auto_fifo_depths = True

max_multithreshold_bit_width: int | None = 8: The optional argument max_multithreshold_bit_width affects which Quant nodes of the QONNX format get converted to the MultiThreshold nodes of FINN. This only affects Quant nodes in the activation path. Quant nodes, which define a bit width larger than max_multithreshold_bit_width are not converted to MultiThreshold nodes and a warning is raised instead. If not given max_multithreshold_bit_width defaults to 8.
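The rule described above amounts to a simple threshold check. The sketch below is an illustration only: converts_to_multithreshold is a hypothetical helper, not a FINN or QONNX function, and the warning text is made up.

```python
import warnings


def converts_to_multithreshold(quant_bit_width, max_multithreshold_bit_width=8):
    """Decide whether an activation Quant node would be converted (illustrative)."""
    if quant_bit_width > max_multithreshold_bit_width:
        # Wider Quant nodes are left unconverted and only produce a warning.
        warnings.warn(
            f"Quant node with bit width {quant_bit_width} exceeds "
            f"max_multithreshold_bit_width={max_multithreshold_bit_width}; "
            "not converting to MultiThreshold."
        )
        return False
    return True
```

With the default of 8, a 4-bit or 8-bit activation Quant node passes the check, while a 16-bit one does not.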

minimize_bit_width: bool | None = True: (Optional) Whether optimizations that minimize the bit width of the weights and accumulator will be applied. Because this optimization relies on the values of the weights, it will only be applied if runtime-writeable weights are not enabled.

mvau_wwidth_max: int | None = 36: (Optional) Control the maximum width of the per-PE MVAU stream while exploring the parallelization attributes to reach target_fps. Only relevant if target_fps is specified. Set this to a large value (e.g. 10000) if targeting full unfolding or very high performance.

output_dir: str: Directory where the final build outputs will be written into

rtlsim_batch_size: int | None = 1: Override the number of inputs for rtlsim performance measurement.

rtlsim_use_vivado_comps: bool | None = True: If set to True, FIFOs with impl_style=vivado will be kept during rtlsim, otherwise they will be replaced by RTL implementations.

save_intermediate_models: bool | None = True: Whether intermediate ONNX files will be saved during the build process. These can be useful for debugging if the build fails.

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]

shell_flow_type: ShellFlowType | None = None: Target shell flow, only needed for generating full bitfiles where the FINN design is integrated into a shell. See documentation of ShellFlowType for options.

signature: List[int] | None = None: Insert a signature node to the stitched-IP to read/write information to the design: e.g. Customer signature, application signature, version

specialize_layers_config_file: str | None = None: (Optional) Path to configuration JSON file in which user can specify a preferred implementation style (HLS or RTL) for each node. The SpecializeLayers transformation picks up these settings and if possible fulfills the desired implementation style for each layer by converting the node into its HLS or RTL variant. Will be applied with qonnx.transformation.general.ApplyConfig

split_large_fifos: bool | None = False: Whether FIFO nodes with depth larger than 32768 will be split. Allows very large FIFOs to be configured in the folding_config_file.

standalone_thresholds: bool | None = False: (Optional) Whether thresholding layers (which implement quantized activations in FINN) will be implemented as stand-alone HW layers, instead of being part of the MatrixVectorActivation layer. This gives greater flexibility, and makes it possible to have runtime-writable thresholds.

start_step: str | None = None: If given, start from this step, loading the intermediate model generated from the previous step (save_intermediate_models must be enabled)

steps: List[Any] | None = None: If given, only run the steps in the list. If not, run default steps. See default_build_dataflow_steps for the default list of steps. When specified, each item can be either a string or a function (functions do not apply to JSON-serialized configs):

- strings are resolved to functions from the default list
- functions are called with (model, DataflowBuildConfig) as args
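How a mixed list of step names and step functions might be resolved and run can be sketched as below. This is not FINN's implementation: the stub step, the custom step, the small lookup table and the use of a plain list in place of a model are all stand-ins, and the convention that each step returns the updated model is an assumption.

```python
def step_tidy_up_stub(model, cfg):
    """Stand-in for a built-in step; records that it ran."""
    return model + ["tidy_up"]


def custom_step_example(model, cfg):
    """Hypothetical user-defined step with the (model, cfg) signature."""
    return model + ["custom"]


# Stand-in for a name-to-function lookup table.
step_lookup = {"step_tidy_up": step_tidy_up_stub}


def resolve_step_items(steps, lookup):
    """Turn a list of step names and/or callables into a list of callables."""
    resolved = []
    for item in steps:
        if isinstance(item, str):
            resolved.append(lookup[item])  # strings are looked up by name
        elif callable(item):
            resolved.append(item)  # functions are used as given
        else:
            raise TypeError(f"Unsupported step item: {item!r}")
    return resolved


def run_steps(model, cfg, steps, lookup):
    """Call each resolved step with (model, cfg), threading the model through."""
    for step in resolve_step_items(steps, lookup):
        model = step(model, cfg)
    return model


result = run_steps([], None, ["step_tidy_up", custom_step_example], step_lookup)
```

Because functions cannot be written into JSON, a custom step like custom_step_example is only usable when the configuration is constructed in Python.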

stitched_ip_gen_dcp: bool | None = False: (Optional) Run synthesis to generate a .dcp for the stitched-IP output product. This can make it easier to treat it as a standalone artifact without requiring the full list of layer IP build directories. By default, synthesis will not run.

stop_step: str | None = None: If given, stop at this step.
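One way to picture start_step and stop_step is as a slice over the ordered step names. The sketch below is an illustration with a hypothetical helper; treating both bounds as inclusive is an assumption, not something this page states.

```python
def select_step_range(step_names, start_step=None, stop_step=None):
    """Return the sub-list of steps from start_step to stop_step (both inclusive)."""
    start = 0 if start_step is None else step_names.index(start_step)
    stop = len(step_names) - 1 if stop_step is None else step_names.index(stop_step)
    return step_names[start : stop + 1]


# First few entries of the default step list, in order.
first_steps = [
    "step_qonnx_to_finn",
    "step_tidy_up",
    "step_streamline",
    "step_convert_to_hw",
]

selected = select_step_range(
    first_steps, start_step="step_tidy_up", stop_step="step_streamline"
)
```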

synth_clk_period_ns: float: Target clock period (in nanoseconds) for Vivado synthesis. e.g. synth_clk_period_ns=5.0 will target a 200 MHz clock. If hls_clk_period_ns is not specified it will default to this value.

target_fps: int | None = None: (Optional) Target inference performance in frames per second. Note that target may not be achievable due to specific layer constraints, or due to resource limitations of the FPGA. If parallelization attributes are specified as part of folding_config_file that will override the target_fps setting here.
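To get a feel for what a given target_fps demands, the frame rate can be translated into a cycle budget per frame at the synthesis clock. The helper below is a hypothetical sketch of that arithmetic; it assumes one frame must complete within the budget and that the result is truncated to an integer.

```python
def target_cycles_per_frame(target_fps, synth_clk_period_ns):
    """Clock cycles available per frame at the given clock period (illustrative)."""
    if target_fps is None:
        return None  # no performance target was set
    cycles_per_second = 10**9 / synth_clk_period_ns
    return int(cycles_per_second / target_fps)


# A 5 ns clock (200 MHz) and 1000 frames per second leave 200000 cycles per frame.
example_budget = target_cycles_per_frame(1000, 5.0)
```

Halving the clock period or halving target_fps doubles the budget, which is why both settings affect how much parallelization is needed.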

to_dict(encode_json=False) → Dict[str, dict | list | str | int | float | bool | None]

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

verbose: bool | None = False: When True, all warnings and compiler output will be printed to stdout. Otherwise, these will be suppressed and only appear in the build log.

verify_expected_output_npy: str | None = 'expected_output.npy': (Optional) Name of .npy file that will be used as the expected output for verification. Only required if verify_steps is not empty.

verify_input_npy: str | None = 'input.npy': (Optional) Name of .npy file that will be used as the input for verification. Only required if verify_steps is not empty.

verify_save_full_context: bool | None = False: (Optional) Save full execution context for each of the verify_steps. By default, only the top-level graph output is saved.

verify_save_rtlsim_waveforms: bool | None = False: (Optional) Save .vcd waveforms from rtlsim under reports. By default, waveforms won’t be saved.

verify_steps: List[VerificationStepType] | None = None: (Optional) At which steps the generated intermediate output model will be verified. See documentation of VerificationStepType for available options.

vitis_floorplan_file: str | None = None: Path to JSON config file assigning each layer to an SLR. Only relevant when shell_flow_type = ShellFlowType.VITIS_ALVEO. Will be applied with qonnx.transformation.general.ApplyConfig

vitis_opt_strategy: VitisOptStrategyCfg | None = 'default': Vitis optimization strategy. Only relevant when shell_flow_type = ShellFlowType.VITIS_ALVEO

vitis_platform: str | None = None: Which Vitis platform will be used, e.g. “xilinx_u250_xdma_201830_2”. Only relevant when shell_flow_type = ShellFlowType.VITIS_ALVEO. If not specified but “board” is specified, will use the FINN default (if any) for that Alveo board.

class finn.builder.build_dataflow_config.DataflowOutputType(value)

Bases: str, Enum

Output product types that can be generated by build_dataflow

BITFILE = 'bitfile'

DEPLOYMENT_PACKAGE = 'deployment_package'

ESTIMATE_REPORTS = 'estimate_reports'

OOC_SYNTH = 'out_of_context_synth'

PYNQ_DRIVER = 'pynq_driver'

RTLSIM_PERFORMANCE = 'rtlsim_performance'

STITCHED_IP = 'stitched_ip'
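Because these enums derive from both str and Enum, their members behave like plain strings when written to or read from JSON. The sketch below demonstrates this with a local stand-in class, OutputTypeSketch, that copies three of the values above; it does not import FINN.

```python
import json
from enum import Enum


class OutputTypeSketch(str, Enum):
    """Local stand-in that mirrors three DataflowOutputType members."""

    ESTIMATE_REPORTS = "estimate_reports"
    STITCHED_IP = "stitched_ip"
    BITFILE = "bitfile"


# Members serialize to their string values without a custom encoder.
encoded = json.dumps([OutputTypeSketch.ESTIMATE_REPORTS, OutputTypeSketch.BITFILE])

# Plain strings read back from JSON can be turned into members by value.
decoded = [OutputTypeSketch(value) for value in json.loads(encoded)]
```

This is why a JSON build configuration can list outputs such as "estimate_reports" directly as strings.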

class finn.builder.build_dataflow_config.LargeFIFOMemStyle(value)

Bases: str, Enum

Type of memory resource to use for large FIFOs.

AUTO = 'auto'

BRAM = 'block'

LUTRAM = 'distributed'

URAM = 'ultra'

class finn.builder.build_dataflow_config.ShellFlowType(value)

Bases: str, Enum

For builds that produce a bitfile, select the shell flow that will integrate the FINN-generated accelerator.

VITIS_ALVEO = 'vitis_alveo'

VIVADO_ZYNQ = 'vivado_zynq'

class finn.builder.build_dataflow_config.VerificationStepType(value)

Bases: str, Enum

Steps at which FINN ONNX execution can be launched for verification.

FOLDED_HLS_CPPSIM = 'folded_hls_cppsim': verify after step_apply_folding_config, using C++ for each HLS node

QONNX_TO_FINN_PYTHON = 'finn_onnx_python': verify after step_qonnx_to_finn, using Python execution

STITCHED_IP_RTLSIM = 'stitched_ip_rtlsim': verify after step_create_stitched_ip, using stitched-ip Verilog

STREAMLINED_PYTHON = 'streamlined_python': verify after step_streamline, using Python execution

TIDY_UP_PYTHON = 'initial_python': verify after step_tidy_up, using Python execution

class finn.builder.build_dataflow_config.VitisOptStrategyCfg(value)

Bases: str, Enum

Vitis optimization strategy with serializable string enum values.

BUILD_SPEED = 'quick'

DEFAULT = 'default'

PERFORMANCE = 'performance'

PERFORMANCE_BEST = 'performance_best'

POWER = 'power'

SIZE = 'size'

finn.builder.build_dataflow_config.default_build_dataflow_steps = ['step_qonnx_to_finn', 'step_tidy_up', 'step_streamline', 'step_convert_to_hw', 'step_create_dataflow_partition', 'step_specialize_layers', 'step_target_fps_parallelization', 'step_apply_folding_config', 'step_minimize_bit_width', 'step_generate_estimate_reports', 'step_hw_codegen', 'step_hw_ipgen', 'step_set_fifo_depths', 'step_create_stitched_ip', 'step_measure_rtlsim_performance', 'step_out_of_context_synthesis', 'step_synthesize_bitfile', 'step_make_pynq_driver', 'step_deployment_package']: List of steps that will be run as part of the standard dataflow build, in the specified order. Use the steps as part of build config to restrict which steps will be run.

finn.builder.build_dataflow_config.estimate_only_dataflow_steps = ['step_qonnx_to_finn', 'step_tidy_up', 'step_streamline', 'step_convert_to_hw', 'step_create_dataflow_partition', 'step_specialize_layers', 'step_target_fps_parallelization', 'step_apply_folding_config', 'step_minimize_bit_width', 'step_generate_estimate_reports']: List of steps to run for an estimate-only (no synthesis) dataflow build

finn.builder.build_dataflow_config.hw_codegen_dataflow_steps = ['step_qonnx_to_finn', 'step_tidy_up', 'step_streamline', 'step_convert_to_hw', 'step_create_dataflow_partition', 'step_specialize_layers', 'step_target_fps_parallelization', 'step_apply_folding_config', 'step_minimize_bit_width', 'step_generate_estimate_reports', 'step_hw_codegen']: List of steps to run for a dataflow build including HW code generation, but without any synthesis.
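The two shorter lists are prefixes of the default list, which the following standard-library sketch makes explicit. The step names are copied from the lists above; steps_up_to is a hypothetical helper and not part of FINN.

```python
# Step names copied from default_build_dataflow_steps.
default_steps = [
    "step_qonnx_to_finn",
    "step_tidy_up",
    "step_streamline",
    "step_convert_to_hw",
    "step_create_dataflow_partition",
    "step_specialize_layers",
    "step_target_fps_parallelization",
    "step_apply_folding_config",
    "step_minimize_bit_width",
    "step_generate_estimate_reports",
    "step_hw_codegen",
    "step_hw_ipgen",
    "step_set_fifo_depths",
    "step_create_stitched_ip",
    "step_measure_rtlsim_performance",
    "step_out_of_context_synthesis",
    "step_synthesize_bitfile",
    "step_make_pynq_driver",
    "step_deployment_package",
]


def steps_up_to(all_steps, last_step):
    """Return the leading part of the list, ending with last_step (inclusive)."""
    return all_steps[: all_steps.index(last_step) + 1]


# Same contents as estimate_only_dataflow_steps and hw_codegen_dataflow_steps.
estimate_only = steps_up_to(default_steps, "step_generate_estimate_reports")
hw_codegen = steps_up_to(default_steps, "step_hw_codegen")
```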

finn.builder.build_dataflow_steps

finn.builder.build_dataflow_steps.build_dataflow_step_lookup = {'step_apply_folding_config': <function step_apply_folding_config>, 'step_convert_to_hw': <function step_convert_to_hw>, 'step_create_dataflow_partition': <function step_create_dataflow_partition>, 'step_create_stitched_ip': <function step_create_stitched_ip>, 'step_deployment_package': <function step_deployment_package>, 'step_generate_estimate_reports': <function step_generate_estimate_reports>, 'step_hw_codegen': <function step_hw_codegen>, 'step_hw_ipgen': <function step_hw_ipgen>, 'step_make_pynq_driver': <function step_make_pynq_driver>, 'step_measure_rtlsim_performance': <function step_measure_rtlsim_performance>, 'step_minimize_bit_width': <function step_minimize_bit_width>, 'step_out_of_context_synthesis': <function step_out_of_context_synthesis>, 'step_qonnx_to_finn': <function step_qonnx_to_finn>, 'step_set_fifo_depths': <function step_set_fifo_depths>, 'step_specialize_layers': <function step_specialize_layers>, 'step_streamline': <function step_streamline>, 'step_synthesize_bitfile': <function step_synthesize_bitfile>, 'step_target_fps_parallelization': <function step_target_fps_parallelization>, 'step_tidy_up': <function step_tidy_up>}: map step name strings to step functions

finn.builder.build_dataflow_steps.prepare_for_stitched_ip_rtlsim(verify_model, cfg)

finn.builder.build_dataflow_steps.step_apply_folding_config(model: ModelWrapper, cfg: DataflowBuildConfig): Apply the folding configuration file onto the model to set folding (parallelization) and other attributes, if config file is specified.

finn.builder.build_dataflow_steps.step_convert_to_hw(model: ModelWrapper, cfg: DataflowBuildConfig): Convert eligible nodes to HWCustomOp subclasses that represent HW layers. Which nodes and particular configurations can be converted to HW is limited, see the source code of the convert_to_hw module for more. At the end, an empty JSON file is created, which can be used to set user-specific preferred implementation styles for each node.

finn.builder.build_dataflow_steps.step_create_dataflow_partition(model: ModelWrapper, cfg: DataflowBuildConfig): Separate consecutive groups of HWCustomOp nodes into StreamingDataflowPartition nodes, which point to a separate ONNX file. Dataflow accelerator synthesis can only be performed on those HWCustomOp sub-graphs.

finn.builder.build_dataflow_steps.step_create_stitched_ip(model: ModelWrapper, cfg: DataflowBuildConfig): Create stitched IP for a graph after all HLS IP blocks have been generated. Depends on the DataflowOutputType.STITCHED_IP output product.

finn.builder.build_dataflow_steps.step_deployment_package(model: ModelWrapper, cfg: DataflowBuildConfig): Create a deployment package including the driver and bitfile.

finn.builder.build_dataflow_steps.step_generate_estimate_reports(model: ModelWrapper, cfg: DataflowBuildConfig): Generate per-layer resource and cycle estimates using analytical models.

finn.builder.build_dataflow_steps.step_hw_codegen(model: ModelWrapper, cfg: DataflowBuildConfig): Generate Vitis HLS code to prepare HLSBackend nodes for IP generation, and fill in RTL templates for RTLBackend nodes.

finn.builder.build_dataflow_steps.step_hw_ipgen(model: ModelWrapper, cfg: DataflowBuildConfig): Run Vitis HLS synthesis on generated code for HLSBackend nodes, in order to generate IP blocks. For RTL nodes this step does not do anything.

finn.builder.build_dataflow_steps.step_make_pynq_driver(model: ModelWrapper, cfg: DataflowBuildConfig): Create a PYNQ Python driver that can be used to interface the generated accelerator.

finn.builder.build_dataflow_steps.step_measure_rtlsim_performance(model: ModelWrapper, cfg: DataflowBuildConfig): Measure performance and latency of the stitched-IP model in rtlsim (pyverilator). Depends on the DataflowOutputType.STITCHED_IP output product.

finn.builder.build_dataflow_steps.step_minimize_bit_width(model: ModelWrapper, cfg: DataflowBuildConfig): Tighten the weight and accumulator bit widths for each layer.

finn.builder.build_dataflow_steps.step_out_of_context_synthesis(model: ModelWrapper, cfg: DataflowBuildConfig): Run out-of-context synthesis and generate reports. Depends on the DataflowOutputType.STITCHED_IP output product.

finn.builder.build_dataflow_steps.step_qonnx_to_finn(model: ModelWrapper, cfg: DataflowBuildConfig): This step will only execute if QONNX nodes are found. These include the following op_types: “Quant”, “Trunc” and “BinaryQuant”. If such nodes are found, the step will run the tidy-up step from QONNX and then convert the QONNX model to the FINN-ONNX dialect.

finn.builder.build_dataflow_steps.step_set_fifo_depths(model: ModelWrapper, cfg: DataflowBuildConfig): Depending on the auto_fifo_depths setting, do one of the following:

* if auto_fifo_depths=True: Run the appropriate auto-sizing transformation to attempt to determine the FIFO sizes that provide full throughput. May take a long time.
* if auto_fifo_depths=False: Assume the folding config file contains FIFO sizes as well. Runs the InsertFIFO transformation, then ApplyConfig(cfg.folding_config_file), and finally RemoveShallowFIFOs. Coherency with config file node naming is ensured by calling GiveUniqueNodeNames.

finn.builder.build_dataflow_steps.step_specialize_layers(model: ModelWrapper, cfg: DataflowBuildConfig): Convert HW nodes to either an HLS or RTL variant of the node. HW nodes get converted either based on pre-determined rules (details can be found in specialize_layers source code) or the user provides a configuration file which contains the desired setting. If the user preference cannot be fulfilled, a warning will be printed and the implementation style will be set to a default.

finn.builder.build_dataflow_steps.step_streamline(model: ModelWrapper, cfg: DataflowBuildConfig): Run streamlining on given model. Streamlining involves moving floating point scale/shift parameters around, collapsing adjacent ones into a single parameter, then absorbing the scale/shift into the following MultiThreshold node. Streamlining requires careful topology design and cannot be applied to all topologies.

finn.builder.build_dataflow_steps.step_synthesize_bitfile(model: ModelWrapper, cfg: DataflowBuildConfig): Synthesize a bitfile using the specified shell flow, with either Vivado or Vitis, to target the specified board.

finn.builder.build_dataflow_steps.step_target_fps_parallelization(model: ModelWrapper, cfg: DataflowBuildConfig): If target_fps was specified, use the SetFolding transformation to determine parallelization attributes. The auto-generated config will be saved as auto_folding_config.json under the outputs, which can serve as a basis for customizing the folding factors further.

finn.builder.build_dataflow_steps.step_tidy_up(model: ModelWrapper, cfg: DataflowBuildConfig): Run the tidy-up step on given model. This includes shape and datatype inference, constant folding, and giving nodes and tensors better names.

finn.builder.build_dataflow_steps.verify_step(model: ModelWrapper, cfg: DataflowBuildConfig, step_name: str, need_parent: bool, rtlsim_pre_hook=None)