Releases: Xilinx/Vitis-AI
Vitis AI 3.5 Release
Release Notes 3.5
Version Compatibility
Vitis™ AI v3.5 and the DPU IP released with the v3.5 branch of this repository are verified as compatible with Vitis, Vivado™, and PetaLinux version 2023.1. If you are using a previous release of Vitis AI, you should review the version compatibility matrix for that release.
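The pairings stated in these notes can be captured in a small lookup. The sketch below is illustrative only: the helper names are hypothetical, and the entries for 3.0, 2.5, and 2.0 are taken from the DPU IP and Whole App Acceleration notes for those releases rather than from a published compatibility matrix.

```python
# Tool releases paired with each Vitis AI release, as stated in these notes.
# ASSUMPTION: only the 3.5 entry is described as verified for Vitis, Vivado,
# and PetaLinux; the older entries come from the DPU IP / WAA upgrade notes.
TOOL_VERSION = {
    "3.5": "2023.1",
    "3.0": "2022.2",
    "2.5": "2022.1",
    "2.0": "2021.2",
}


def required_tool_version(vitis_ai_release):
    """Return the tool release paired with a Vitis AI release (hypothetical helper)."""
    try:
        return TOOL_VERSION[vitis_ai_release]
    except KeyError:
        raise ValueError(
            f"no pairing recorded for Vitis AI {vitis_ai_release}"
        ) from None


def is_compatible(vitis_ai_release, installed_tools):
    """True when the installed tool release matches the recorded pairing."""
    return required_tool_version(vitis_ai_release) == installed_tools
```

For any release not listed, the version compatibility matrix for that release remains the authoritative source.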
Documentation and Github Repository
- Merged UG1313 into UG1414
- Streamlined UG1414 to remove redundant content
- Streamlined UG1414 to focus exclusively on core tool usage. Core tools such as the Optimizer, Quantizer, and Compiler are now used across multiple targets (e.g., Ryzen™ AI, EPYC™), and this change makes UG1414 more portable to those targets
- Migrated Adaptable SoC and Alveo specific content from UG1414 to Github.IO
- New Github.IO Toctree structure
- Integrated VART Runtime APIs in Doxygen format
Docker Containers and GPU Support
- Removed Anaconda dependency from TensorFlow 2 and PyTorch containers in order to address Anaconda commercial license requirements
- Updated Docker container to disable Ubuntu 18.04 support (which was available in Vitis AI but not officially supported). This was done to address CVE-2021-3493.
Model Zoo
- Added more classic models without modification, such as the YOLO series and 2D Unet
- Provide model info card for each model and Jupyter Notebook tutorials for new models
- New copyleft repo for GPL license models
ONNX CNN Quantizer
- Initial release
- This is a new quantizer that supports the direct PTQ quantization of ONNX models for DPU. It is a plugin built for the ONNXRuntime native quantizer.
- Supports power-of-two quantization with both QDQ and QOP format.
- Supports Non-overflow and Min-MSE quantization methods.
- Supports various quantization configurations in power-of-two quantization in both QDQ and QOP format.
- Supports signed and unsigned configurations.
- Supports symmetric and asymmetric configurations.
- Supports per-tensor and per-channel configurations.
- Supports ONNX models in excess of 2GB.
- Supports the use of the CUDAExecutionProvider for calibration in quantization.
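To make the difference between the Non-overflow and Min-MSE methods concrete, here is a minimal sketch of signed, symmetric, per-tensor power-of-two quantization. It is not the quantizer's implementation: the function names, the 8-bit width, and the search range are assumptions chosen for illustration.

```python
import math


def quantize_pof2(values, frac_bits, bits=8):
    """Quantize with scale 2**-frac_bits and return the dequantized floats."""
    scale = 2.0 ** -frac_bits
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [max(qmin, min(qmax, round(v / scale))) * scale for v in values]


def non_overflow_frac_bits(values, bits=8):
    """Largest number of fractional bits for which no value is clipped."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bits - 1
    qmax = 2 ** (bits - 1) - 1
    return math.floor(math.log2(qmax / max_abs))


def mse(values, frac_bits, bits=8):
    """Mean squared error between the values and their quantized form."""
    deq = quantize_pof2(values, frac_bits, bits)
    return sum((a - b) ** 2 for a, b in zip(values, deq)) / len(values)


def min_mse_frac_bits(values, bits=8, search=4):
    """Try a few finer scales than non-overflow and keep the lowest-error one."""
    start = non_overflow_frac_bits(values, bits)
    return min(range(start, start + search + 1), key=lambda f: mse(values, f, bits))


# One outlier (4.0) dominates the range of otherwise small values.
weights = [0.01, -0.02, 0.015, 0.03, -0.01, 4.0]
print(non_overflow_frac_bits(weights), min_mse_frac_bits(weights))
```

In this example the non-overflow rule keeps the outlier exact at the cost of rounding every small value to zero, while the Min-MSE search accepts slight clipping of the outlier in exchange for a finer scale.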
PyTorch CNN Quantizer
- PyTorch 1.13 and 2.0 support
- Mixed precision quantization support, supporting float32/float16/bfloat16/intx mixed quantization
- Support of bit-wise accuracy cross check between quantizer and ONNX-runtime
- Split and chunk operators are automatically converted to slicing
- Dict input/output support for model forward function
- Keyword argument support for model forward function
- Matmul subroutine support
- Add support for BFP data type quantization
- QAT supports training on multiple GPUs
- QAT supports operations with multiple inputs or outputs
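The conversion of chunk operators to slicing mentioned above can be pictured with a small sketch. It computes the slice bounds that reproduce chunk-style splitting along one dimension, assuming the usual rule that each chunk has ceil(length / chunks) elements and the last one may be shorter. The helper name is hypothetical, and the quantizer's actual graph rewrite is not described in these notes.

```python
import math


def chunk_as_slices(length, chunks):
    """Return (start, stop) pairs equivalent to splitting into `chunks` pieces."""
    if chunks <= 0:
        raise ValueError("chunks must be positive")
    if length == 0:
        return []
    size = math.ceil(length / chunks)
    return [(start, min(start + size, length)) for start in range(0, length, size)]


data = list(range(10))
pieces = [data[a:b] for a, b in chunk_as_slices(len(data), 3)]
print(pieces)
```

Because each piece is a plain slice with fixed bounds, the split no longer needs a dedicated operator in the exported graph.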
TensorFlow 2 CNN Quantizer
- Updated to support TensorFlow 2.12 and Python 3.8.
- Adds support for quantizing subclass models.
- Adds support for mixed precision with layer-wise data type configuration, covering float32, float16, bfloat16, and int quantization.
- Adds support for BFP datatypes, and adds a new quantize strategy called 'bfp'.
- Adds support to quantize Keras nested models.
- Adds experimental support for quantizing the frozen pb format model in TensorFlow 2.x.
- Adds a new 'gpu' quantize strategy which uses float scale quantization and is used in GPU deployment scenarios.
- Adds support for exporting the quantized model to frozen pb format or onnx format.
- Adds support for exporting the quantized model with power-of-two scales to frozen pb format with "FixNeuron" inside, to be compatible with some compilers with pb format input.
- Adds support for splitting Avgpool and Maxpool with large kernel sizes into smaller kernel sizes.
Bug Fixes:
- Fixes a gradient bug in the 'pof2s_tqt' quantize strategy.
- Fixes a bug of quantization position change introduced by the fast fine-tuning process after the PTQ.
- Fixes a graph transformation bug when a TFOpLambda op has multiple inputs.
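As background for the BFP datatype support listed in this release, the following is a generic sketch of block floating point: a block of values shares one exponent, and each value keeps only a small signed integer mantissa. The block size, mantissa width, and rounding used by the 'bfp' strategy are not stated in these notes, so the 8-bit mantissa and function names here are assumptions.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize one block to (shared_exponent, signed integer mantissas)."""
    max_abs = max(abs(v) for v in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp writes max_abs as m * 2**e with 0.5 <= m < 1.
    _, shared_exp = math.frexp(max_abs)
    # Step size chosen so the largest magnitude fits the signed mantissa range.
    step = 2.0 ** (shared_exp - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1
    mantissas = [max(-limit, min(limit, round(v / step))) for v in block]
    return shared_exp, mantissas


def bfp_dequantize(shared_exp, mantissas, mantissa_bits=8):
    """Rebuild floats from the shared exponent and the mantissas."""
    step = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * step for m in mantissas]


exp, mants = bfp_quantize([1.0, 0.5, -0.25, 0.003])
print(exp, mants, bfp_dequantize(exp, mants))
```

Values much smaller than the block maximum lose precision, which is why the choice of block size matters in practice.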
TensorFlow 1 CNN Quantizer
- Adds support for fast fine-tuning that improves PTQ accuracy.
- Adds support for folding Reshape and ResizeNearestNeighbor operators.
- Adds support for splitting Avgpool and Maxpool with large kernel sizes into smaller kernel sizes.
- Adds support for quantizing Sum, StridedSlice, and Maximum operators.
- Adds support for setting the input shape of the model, which is useful in the deployment of models with undefined input shapes.
- Adds support for setting the opset version in exporting onnx format.
Bug Fixes:
- Fixes a bug where the AddV2 operation is misidentified as a BiasAdd.
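Both TensorFlow quantizers now split Avgpool and Maxpool layers with large kernels into smaller ones. The notes do not describe the splitting strategy, so the sketch below only illustrates why such a split can be exact in the simplest case: one-dimensional, non-overlapping windows without padding, where a kernel of 6 equals a kernel of 3 followed by a kernel of 2. All names are hypothetical.

```python
def pool_1d(xs, kernel, stride, reduce_fn):
    """Slide a window over xs and reduce each full window."""
    return [reduce_fn(xs[i:i + kernel]) for i in range(0, len(xs) - kernel + 1, stride)]


def mean(window):
    """Arithmetic mean of one window."""
    return sum(window) / len(window)


def chained_pool_1d(xs, kernels, reduce_fn):
    """Apply non-overlapping pools (stride == kernel) one after another."""
    for k in kernels:
        xs = pool_1d(xs, k, k, reduce_fn)
    return xs


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
print(pool_1d(signal, 6, 6, max), chained_pool_1d(signal, [3, 2], max))
print(pool_1d(signal, 6, 6, mean), chained_pool_1d(signal, [3, 2], mean))
```

The average-pool equivalence relies on every window having the same size; with padding or overlapping windows a real implementation would need extra care.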
Compiler
- New operators supported: Broadcast add/mul, Bilinear downsample, Trilinear downsample, Group conv2d, Strided-slice
- Improved performance on XV2DPU
- Improved error messages
- Reduced compilation time
PyTorch Optimizer
- Removed requirement for license purchase
- Migrated to Github open-source
- Supports PyTorch 1.11, 1.12 and 1.13
- Supports pruning of grouped convolution
- Supports setting the number of channels to be a multiple of the specified number after pruning
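The channel-multiple option can be illustrated with a one-line rounding rule. This is a sketch under assumptions: the notes do not say whether the optimizer rounds up or down, so the helper below (a hypothetical name) rounds up and caps the result at the layer's original channel count.

```python
def round_channels(kept, multiple, total):
    """Round a pruned channel count up to a multiple, capped at the original count."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    rounded = -(-kept // multiple) * multiple  # ceiling division, then scale back
    return min(rounded, total)


# A layer pruned from 64 to 37 channels, constrained to multiples of 8.
print(round_channels(37, 8, 64))
```

Keeping channel counts aligned in this way is typically done so that pruned layers map cleanly onto hardware that processes channels in fixed-size groups.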
TensorFlow 2 Optimizer
- Removed requirement for license purchase
- Migrated to Github open-source
- Supports TensorFlow 2.11 and 2.12
- Supports pruning of tf.keras.layers.SeparableConv2D
- Fixed tf.keras.layers.Conv2DTranspose pruning bug
- Supports setting the number of channels to be a multiple of the specified number after pruning
Runtime
- Supports Versal AI Edge VEK280 evaluation kit
- Buffer optimization for multi-batches to improve performance
- Add new tensor buffer interface to enhance zero copy
Vitis ONNX Runtime Execution Provider (VOE)
- Supports ONNX Opset version 18, ONNX Runtime 1.16.0 and ONNX version 1.13
- Supports both C++ and Python APIs (Python version 3)
- Supports VitisAI EP and other EPs to work together to deploy the model
- Provides ONNX examples based on C++ and Python APIs
- VitisAI EP is open source and upstreamed to ONNX public repo on Github
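When the VitisAI EP works alongside other EPs, ONNX Runtime takes an ordered list of provider names and falls back down the list for nodes an earlier provider cannot handle. The sketch below only builds that list; `provider_priority` is a hypothetical helper, the model path in the comment is a placeholder, and any provider options the VitisAI EP may require are not shown.

```python
def provider_priority(available,
                      preferred=("VitisAIExecutionProvider", "CPUExecutionProvider")):
    """Keep the preferred providers that are available, in priority order."""
    chosen = [name for name in preferred if name in available]
    if not chosen:
        raise RuntimeError("none of the preferred execution providers is available")
    return chosen


# With ONNX Runtime installed, the list would be used roughly like this
# ("model.onnx" is a placeholder path):
#   import onnxruntime as ort
#   providers = provider_priority(ort.get_available_providers())
#   session = ort.InferenceSession("model.onnx", providers=providers)
print(provider_priority(["CPUExecutionProvider", "VitisAIExecutionProvider"]))
```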
Library
- Added three new model libraries and support for five additional models
Model Inspector
- Supports inspection for new DPU IPs
Profiler
- Added Profiler support for DPUCV2DX8G
DPU IP - Versal AIE-ML Targets DPUCV2DX8G (Versal AI Edge / Core)
- First general access release
- Configurable from C20B1 to C20B14
- Support most 2D operators required to deploy models found in the Model Zoo
- General support for the VE2802/VC2802 and V70
- Early access support for the VE2302 via this lounge
DPU IP - Zynq Ultrascale+ DPUCZDX8G
- IP has reached maturity
- No updates for this release
- No updated reference design (DPU TRD) will be published for minor (i.e., x.5) releases
- No updated pre-built board image will be published for minor (i.e., x.5) releases
DPU IP - Versal AIE Targets DPUCVDX8G
- IP has reached maturity
- No updates for this release
- No updated reference design (DPU TRD) will be published for minor (i.e., x.5) releases
- No updated pre-built board image will be published for minor (i.e., x.5) releases
DPU IP - CNN - Alveo Data Center DPUCVDX8H
- IP has reached maturity
- No updates for this release
- No updated reference design (DPU TRD) will be published for minor (i.e., x.5) releases
- No updated pre-built board image will be published for minor (i.e., x.5) releases
WeGO
- Enhanced WeGO to support V70 DPU GA release.
- Upgraded WeGO to provide support for PyTorch 1.13.1 and TensorFlow r2.12.
- Enhanced WeGO-Torch to support PyTorch 2.0 as a preview feature.
- Introduced new C++ API support for WeGO-Torch in addition to Python APIs.
- Implemented WeGO-TF1 and WeGO-TF2 as out-of-tree plugins.
Known Issues
- Engineering to add comments
AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc.
Vitis AI 3.0 Release
Release Notes 3.0
Documentation and Github Repository
- Migrated core documentation to Github.IO.
- Incorporated offline HTML documentation for air-gapped users.
- Restructured user documentation.
- Restructured repository directory structure for clarity and ease-of-use.
Docker Containers and GPU Support
- Migrated from multi-framework to per framework Docker containers.
- Enabled Docker ROCm GPU support for quantization and pruning.
Model Zoo
- Updated Model Zoo with commentary regarding dataset licensing restrictions
- Added 14 new models and deprecated 28 models for a total of 130 models
- Added super resolution 4x, as well as 2D and 3D semantic segmentation for Medical applications
- Optimized models for benchmarks:
  - MLPerf: 3D-Unet
  - FAMBench: MaskRCNN
- Provides optimized backbones supporting YoloX, v4, v5, v6 and EfficientNet-Lite
- Ease-of-use enhancements, including replacing markdown-format performance tables with a downloadable Model Zoo spreadsheet
- Added 72 PyTorch and TensorFlow models for AMD EPYC™ CPUs, targeting deployment with ZenDNN
- Added models to support AMD GPU architectures based on ROCm and MIGraphX
TensorFlow 2 CNN Quantizer
- Based on TensorFlow 2.10
- Updated the Model Inspector for improved accuracy of partitioning results expected from the DPU compiler.
- Added support for datatype conversions for float models, including FP16, BFloat16, FP32, and double.
- Added support for exporting quantized ONNX format models (to support the ONNX Runtime).
- Added support for new layer types including SeparableConv2D and PReLU.
- Added support for unsigned integer quantization.
- Added support for automatic modification of input shapes for models with variable input shapes.
- Added support to align the input and output quantize positions for Concat and Pooling layers.
- Added error codes and improved the readability of the error and warning messages.
- Various bug fixes.
TensorFlow 1 CNN Quantizer
- Separated the quantizer code from the TensorFlow code, making it a plug-in module to the official TensorFlow code base.
- Added support for exporting quantized ONNX format models (to support the ONNX Runtime).
- Added support for datatype conversions for float models, including FP16, BFloat16, FP32 and double.
- Added support for additional operations, including Max, Transpose, and DepthToSpace.
- Added support for aligning input and output quantize positions of Concat and Pooling operations.
- Added support for automatic replacement of Softmax with DPU-accelerated Softmax.
- Added error codes and improved the readability of the error and warning messages.
- Various bug fixes.
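Aligning quantize positions for Concat means that every input and the output share one power-of-two scale, so the concatenation itself needs no arithmetic. The sketch below assumes a quantize position is the number of fractional bits (scale 2**-position) and that the shared position is the smallest one among the inputs, which avoids overflow. The notes do not state the rule the quantizers apply, and the function names are hypothetical.

```python
def align_concat_positions(input_positions):
    """Pick one shared fraction position for all Concat inputs and the output."""
    return min(input_positions)


def requantize(q, from_pos, to_pos):
    """Move an integer value from scale 2**-from_pos to scale 2**-to_pos."""
    return round(q * 2.0 ** (to_pos - from_pos))


positions = [6, 4]                    # two Concat inputs with different scales
shared = align_concat_positions(positions)
# 100 at position 6 represents 100 / 64 = 1.5625, which is 25 / 16 at position 4.
print(shared, requantize(100, 6, shared))
```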
PyTorch CNN Quantizer
- Support PyTorch 1.11 and 1.12.
- Support exporting torch script format quantized model.
- QAT supports exporting trained model to ONNX and torch script.
- Support FP16 model quantization.
- Optimized the Inspector to support more pattern types and to keep device assignment backward compatible.
- Cover more PyTorch operators: more than 560 types of PyTorch operators are supported.
- Enhanced parsing to support control flow parsing.
- Enhanced message system with more useful message text.
- Support fusing and quantization of BatchNorm without affine calculation.
Compiler
- Added support for new operators, including: strided_slice, cost volume, correlation 1D & 2D, argmax, group conv2d, reduction_max, reduction_mean
- Added support for Versal™ AIE-ML architectures DPUCV2DX8G (V70 and Versal AI Edge)
- Focused effort to improve the intelligibility of error and partitioning messages
PyTorch Optimizer
- Added support for fine-grained model pruning (sparsity)
- OFA support for convolution layers with kernel sizes = (1,3) and dilation
- OFA support for ConvTranspose2D
- Added pruning configuration that allows users to specify pruning hyper-parameters
- Specific exception types are defined for each type of error
- Enhanced parallel model analysis with increased robustness
- Support for PyTorch 1.11 and 1.12
TensorFlow 2 Optimizer
- Added support for Keras ConvTranspose2D, Conv3D, ConvTranspose3D
- Added support for the TFOpLambda operator
- Added pruning configuration that allows users to specify pruning hyper-parameters
- Specific exception types are defined for each type of error
- Added support for TensorFlow 2.10
Runtime and Library
- Added support for Versal AI Edge VEK280 evaluation kit and Alveo™ V70 accelerator cards (Early Access)
- Added support for ONNX runtime, with eleven ONNX-specific examples
- Added four new model libraries to the Vitis™ AI Library and support for fifteen additional models
- Focused effort to improve the intelligibility of error messages
Profiler
- Added Profiler support for DPUCV2DX8G (VEK280 Early Access)
- Added Profiler support for Versal DDR bandwidth profiling
DPU IP - Zynq Ultrascale+ DPUCZDX8G
- Upgraded to enable Vivado™ and Vitis 2022.2 release
- Added support for 1D and 2D Correlation, Argmax and Max
- Reduced resource utilization
- Timing closure improvements
DPU IP - Versal AIE Targets DPUCVDX8G
- Upgraded to enable Vivado and Vitis 2022.2 release
- Added support for 1D and 2D Correlation
- Added support for Argmax and Max along the channel dimension
- Added support for Cost-Volume
- Reduced resource utilization
- Timing closure improvements
DPU IP - Versal AIE-ML Targets DPUCV2DX8G (Versal AI Edge)
- Early access release supporting early adopters with an early, unoptimized AIE-ML DPU
- Supports most 2D operators (currently does not support 3D operators)
- Supports batch sizes from 1 to 13
- Supports more than 90 Model Zoo models
DPU IP - CNN - Alveo Data Center DPUCVDX8H
- Upgraded to enable Vitis 2022.2 release
- Timing closure improvements via scripts supplied for .xo workflows
DPU IP - CNN - V70 Data Center DPUCV2DX8G
- Early access release supporting early adopters with an unoptimized DPU
- Supports most 2D operators (currently does not support 3D operators)
- Batch size 13 support
- Supports more than 70 Model Zoo models
WeGO
- Integrated WeGO with the Vitis-AI Quantizer to enable on-the-fly quantization and improve ease-of-use
- Introduced serialization and deserialization with the WeGO flow to offer the capability of building once and running anytime
- Incorporated AMD ZenDNN into WeGO, enabling additional optimization for AMD EPYC CPU targets
- Improved WeGO robustness to offer a better developer experience and support a wider range of models
Known Issues
- A bitstream loading error occurs when the AIE-ML DPU application running on the VEK280 kit is interrupted manually
- HDMI is not functional for the early access VEK280 image. The issue will be fixed in the next release
Vitis AI 2.5 Release
New Features/Highlights
- AI Model Zoo added 14 new models, including BERT-based NLP, Vision Transformer (ViT), Optical Character Recognition (OCR), Simultaneous Localization and Mapping (SLAM), and more Once-for-All (OFA) models
- Added 38 base & optimized models for AMD EPYC server processors
- AI Quantizer added model inspector, now supports TensorFlow 2.8 and PyTorch 1.10
- Whole Graph Optimizer (WeGO) supports PyTorch 1.x and TensorFlow 2.x
- Deep Learning Processing Unit (DPU) for Versal® ACAP supports multiple compute units (CU), new Arithmetic Logic Unit (ALU) engine, Depthwise convolution and more operators supported by the DPUs on VCK5000 and Alveo™ data center accelerator cards
- Inference server supports ZenDNN as backend on AMD EPYC™ server processors
- New examples added to Whole Application Acceleration (WAA) for VCK5000 Versal development card and Zynq® UltraScale+™ evaluation kits
Release Notes
AI Model Zoo
- Added 14 new models, for 134 models in total
- Expanded model categories for diverse AI workloads:
  - Added models for data center application requirements including text detection and end-to-end OCR
  - Added BERT-based NLP and Vision Transformer (ViT) models on VCK5000
  - More OFA-optimized models, including OFA-RCAN for Super-Resolution and OFA-YOLO for Object Detection
  - Added models for industrial vision and SLAM, including Interest Point Detection & Description model and Hierarchical Localization model.
- Added 38 base & optimized models for AMD EPYC CPU
- EoU enhancement:
  - Improved model index by application categories
AI Quantizer-CNN
- Added Model Inspector that inspects a float model and shows partition results
- Support TensorFlow 2.8 and PyTorch 1.10
- Support float-scale and per-channel quantization
- Support configuration for different quantize strategies
AI Optimizer
- OFA enhancement:
  - Support even kernel size of convolution
  - Support ConvTranspose2d
  - Updated examples
- One-step and iterative pruning enhancement:
  - Resume model analysis or search after an exception
AI Compiler
- Support ALU for DPUCZDX8G
- Support new models
AI Library / VART
- Added 6 new model libraries and support for 17 new models
- Custom Op Enhancement
- Added new CPU operators
- Xdputil Tool Enhancement
- Two new demos on VCK190 Versal development board
AI Profiler
- Full support on custom OP and Graph Runner
- Stability optimization
Edge DPU-DPUCZDX8G
- New ALU engine to replace pool engine and DepthWiseConv engine in MISC:
  - ALU: support new features, e.g. large-kernel-size MaxPool, AveragePool, rectangle-kernel-size AveragePool, 16bit const weights
  - ALU: support HardSigmoid and HardSwish
  - ALU: support DepthwiseConv + LeakyReLU
  - ALU: support the parallelism configuration
- DPU IP and TRD on ZCU102 with encrypted RTL IP based on 2022.1 Vitis platform
Edge DPU-DPUCVDX8G
- Optimized ALU that better supports features like channel-attention
- Support multiple compute units
- Support DepthwiseConv + LeakyReLU
- Support Versal DPU IP and TRD on VCK190 with encrypted RTL and AIE code, still in the C32B1-6/C64B1-5 configurations and based on the 2022.1 Vitis platform
Cloud DPU-DPUCVDX8H
- Enlarged DepthWise convolution kernel size that ranges from 1x1 to 8x8
- Support AIE based pooling and ElementWise add & multiply, and big kernel size pooling
- Support more DepthWise convolution kernel sizes
Cloud DPU-DPUCADF8H
- Support ReLU6/LeakyReLU and MobileNet series models
- Fixed the issue of missing directories in some cases in the .XO flow
Whole Graph Optimizer (WeGO)
- Support PyTorch 1.x and TensorFlow 2.x in-framework inference
- Added 19 PyTorch 1.x/TensorFlow 2.x/TensorFlow 1.x examples, including classification, object detection, segmentation and more
Inference Server
- Added gRPC API to inference server flow
- Support TensorFlow/PyTorch
- Support AMD ZenDNN as backend
WAA
- New examples for VCK5000 & ZCU104 - ResNet & adas_detection
- New ResNet example containing AIE based pre-processing kernel
- Xclbin generation using Pre-built DPU flow for ZCU102/U50 ResNet and adas_detection applications
- Xclbin generation using build flow for ZCU104/VCK190 ResNet and adas_detection applications
- Porting of all VCK190 examples from ES1 to production version and use base platform instead of custom platform
Vitis AI 2.0 Release
Release 2.0
New Features/Highlights
- General Availability (GA) for VCK190 (Production Silicon), VCK5000 (Production Silicon) and U55C
- Add support for newer PyTorch and TensorFlow versions: PyTorch 1.8-1.9, TensorFlow 2.4-2.6
- Add 22 new models, including Solo, Yolo-X, UltraFast, CLOCs, PSMNet, FairMOT, SESR, DRUNet, SSR as well as 3 NLP models and 2 OFA (Once-for-all) models
- Add the new custom OP flow to run models with OPs the DPU does not support, with enhancements across quantizer, compiler and runtime
- Add more layers and configurations of DPU for VCK190 and DPU for VCK5000
- Add OFA pruning and TF2 keras support for AI optimizer
- Run inference directly from TensorFlow (Demo) for cloud DPU
Release Notes
Model Zoo
- 22 new models added, 130 total
  - 19 new PyTorch models including 3 NLP and 2 OFA models
  - 3 new TensorFlow models
- Added new application models
  - AD/ADAS: Solo for instance segmentation, Yolo-X for traffic sign detection, UltraFast for lane detection, CLOCs for sensor fusion
  - Medical: SESR for super resolution, DRUNet for image denoise, SSR for spectral remove
  - Smart city and industrial vision: PSMNet for binocular depth estimation, FairMOT for joint detection and Re-ID
- EoU Enhancements
  - Updated automatic script to search and download required models
Quantizer
- TF2 quantizer
  - Add support for TF 2.4-2.6
  - Add support for custom OP flow, including shape inference, quantization and dumping
  - Add support for CUDA 11
  - Add support for input_shape assignment when deploying QAT models
  - Improve support for TFOpLambda layers
  - Update support for hardware simulation, including sigmoid layer, leaky_relu layer, global and non-global average pooling layer
  - Bug fixes for sequential models and quantize position adjustment
- TF1 quantizer
  - Add quantization support for new ops, including hard-sigmoid, hard-swish, element-wise multiply ops
  - Add support for replacing normal sigmoid with hard sigmoid
  - Update support for float weights dumping when dumping golden results
  - Bug fixes for inconsistency between the Python APIs and CLI APIs
- PyTorch quantizer
  - Add support for PyTorch 1.8 and 1.9
  - Support CUDA 11
  - Support custom OP flow
  - Improve fast finetune performance on memory consumption and accuracy
  - Reduce feature-map memory consumption during quantization
  - Improve QAT functions including better initialization of quantization scale and new API for getting quantizer's parameters
  - Support quantization of more operations: some 1D and 3D ops, DepthwiseConvTranspose2D, pixel-shuffle, pixel-unshuffle, const
  - Support CONV/BN merging in pattern of CONV+CONCAT+BN
  - Message enhancements to help users locate problems
  - Bug fixes for consistency with hardware
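The TF1 quantizer's option to replace a normal sigmoid with a hard sigmoid trades a small approximation error for a function built only from add, clamp, and scale. The sketch below uses the common definition relu6(x + 3) / 6; the exact variant implemented for the DPU is not given in these notes, so treat the formula as an assumption.

```python
import math


def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def hard_sigmoid(x):
    """Piecewise-linear approximation: relu6(x + 3) / 6."""
    return min(6.0, max(0.0, x + 3.0)) / 6.0


# Largest gap between the two functions on a grid from -8 to 8.
grid = [i / 10.0 for i in range(-80, 81)]
worst = max(abs(sigmoid(x) - hard_sigmoid(x)) for x in grid)
print(f"max abs difference on [-8, 8]: {worst:.3f}")
```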
Optimizer
- TensorFlow 1.15
  - Support tf.keras.Optimizer for model training
- TensorFlow 2.x
  - Support TensorFlow 2.3-2.6
  - Add iterative pruning
- PyTorch
  - Support PyTorch 1.4-1.9.1
  - Support shared parameters in pruning
  - Add one-step pruning
  - Add once-for-all (OFA)
  - Unified APIs for iterative and one-step pruning
  - Enable pruned model to be used by quantizer
  - Support nn.Conv3d and nn.ConvTranspose3d
Compiler
- DPU on embedded platforms
- Support and optimize conv3d, transposedconv3d, upsample3d and upsample2d for DPUCVDX8G(xvDPU)
- Improve the efficiency of high resolution input for DPUCVDX8G(xvDPU)
- Support ALUv2 new features
- DPU on Alveo/Cloud
- Support depthwise-conv2d, h-sigmoid and h-swish for DPUCVDX8H(DPUv4E)
- Support depthwise-conv2d for DPUCAHX8H(DPUv3E)
- Support high resolution model inference
- Support custom OP flow
AI Library and VART
- Support all the new models in Model Zoo: end-to-end deployment in Vitis AI Library
- Improved GraphRunner to better support custom OP flow
- Add examples on how to integrate custom OPs
- Add more pre-implemented CPU OPs
- DPU driver/runtime update to support Xilinx Device Tree Generator (DTG) for Vivado flow
AI Profiler
- Support CPU tasks tracking in graph runner
- Better memory bandwidth analysis in text summary
- Better performance to enable the analysis of large models
Custom OP Flow
- Provides new capability of deploying models with DPU unsupported OPs
  - Define custom OPs in quantization
  - Register and implement custom OPs before the deployment by graph runner
- Add two examples
  - Pointpillars PyTorch model
  - MNIST TensorFlow 2 model
DPU
- CNN DPU for Zynq SoC / MPSoC, DPUCZDX8G (DPUv2)
  - Upgraded to 2021.2
  - Update interrupt connection in Vivado flow
- CNN DPU for Alveo-HBM, DPUCAHX8H (DPUv3E)
  - Support depth-wise convolution
  - Support U55C
- CNN DPU for Alveo-DDR, DPUCADF8H (DPUv3Int8)
  - Updated U200/U250 xclbins with XRT 2021.2
  - Released XO Flow
  - Released IP Product Guide (PG400)
- CNN DPU for Versal, DPUCVDX8G (xvDPU)
  - C32 (32-aie cores for a single batch) and C64 (64-aie cores for a single batch) configurable
  - Support configurable batch size 1~5 for C64
  - Support and optimize new OPs: conv3d, transposedconv3d, upsample3d and upsample2d
  - Reduce Conv bubbles and compute redundancy
  - Support 16-bit const weights in ALUv2
- CNN DPU for Versal, DPUCVDX8H (DPUv4E)
  - Support depth-wise convolution with 6 PE configuration
  - Support h-sigmoid and h-swish
Whole App Acceleration
- Upgrade to Vitis and Vivado 2021.2
- Custom plugin example: PSMNet using Cost Volume (RTL Based) accelerator on VCK190
- New accelerator for Optical Flow (TV-L1) on U50
- High resolution segmentation application on VCK190
- Options to compare throughput & accuracy between FPGA and CPU Versions
- Throughput improvements ranging from 25% to 368%
- Reorganized for better usability and visibility
TVM
- Add support of DPUs for U50 and U55C
WeGO (Whole Graph Optimizer)
- Run inference directly from Tensorflow framework for cloud DPU
- Automatically perform subgraph partitioning and apply optimization/acceleration for DPU subgraphs
- Dispatch non-DPU subgraphs to TensorFlow running on CPU
- Resnet50 and Yolov3 demos on VCK5000
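The partitioning behaviour described for WeGO can be pictured with a toy example. The sketch assumes the model is a simple linear chain of operators and that DPU support is a plain set lookup; real graphs have branches, and WeGO's actual partitioner is not described here. The operator names are made up for illustration.

```python
from itertools import groupby


def partition(ops, dpu_supported):
    """Group consecutive ops into ("DPU" | "CPU", [ops]) subgraphs."""
    def target(op):
        return "DPU" if op in dpu_supported else "CPU"

    return [(device, list(group)) for device, group in groupby(ops, key=target)]


# Hypothetical op names; "custom_nms" and "softmax" stand in for ops that
# would be dispatched to the framework running on the CPU.
ops = ["conv2d", "relu", "custom_nms", "conv2d", "softmax"]
for device, subgraph in partition(ops, {"conv2d", "relu"}):
    print(device, subgraph)
```

Each DPU group would then be compiled and accelerated as one unit, while the CPU groups stay in the framework.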
Inference Server
- Support xmodel serving in cloud / on-premise (EA)
Known Issues
- vai_q_caffe hangs when TRAIN and TEST phases point to the same LMDB file
- TVM compiled Inception_v3 model gives low accuracy with DPUCADF8H (DPUv3Int8)
- TensorFlow 1.15 quantizer error in QAT caused by an incorrect pattern match
Vitis AI 1.4.1 Release
Release 1.4.1
New Features/Highlights
- Vitis AI RNN docker public release, including RNN quantizer and compiler
- New unified xRNN runtime for U25 & U50LV based on VART Runner interface and XIR xmodels
- Release Versal DPU TRD based on 2021.1
- Versal WAA app updated to provide better throughput using the new XRT C++ APIs and zero copy
- TVM ease-of-use improvement
- Support VCK190 and VCK5000 production boards
- Bug fixes, including an xCompiler data alignment issue affecting WAA and a quantizer bug
Vitis AI 1.4 Release
New Features/Highlights
- Support new platforms, including Versal ACAP platforms VCK190, VCK5000 and Kria SoM
- Better PyTorch and TensorFlow model support: PyTorch 1.5-1.7.1, improved quantization for TensorFlow 2.x models
- New models, including 4D Radar detection, Image-Lidar sensor fusion, 3D detection & segmentation, multi-task, depth estimation, super resolution for automotive, smart medical and industrial vision applications
- New Graph Runner API to deploy models with multiple subgraphs
- DPUCADX8G (DPUv1) deprecated in favor of DPUCADF8H (DPUv3Int8)
- DPUCAHX8H (DPUv3E) and DPUCAHX8L (DPUv3ME) release with xo
- Classification & Detection WAA examples for Versal (VCK190)
v1.3.2 Release
- Enable Ubuntu 20.04 on MPSoC (Vitis AI Runtime and Vitis AI Library)
- Added environment variable for Vitis AI Library’s model search path
- Bug fixes for pytorch / LSTM and log improvement
Vitis-AI 1.3.1 Release
- Updated compiler to improve performance by 5% on average for most models
- Added zero copy support (new APIs in VART / Vitis AI Library)
- Added cross-layer equalization support in TensorFlow v1.15
- Added WAA U50 TRD
- Updated U280 Pre-processing using Multi-preprocessing JPEG decode kernels
- Bug fixes and improvements for v1.3
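Cross-layer equalization, added to the TensorFlow v1.15 quantizer in this release, rescales matching channels of two consecutive layers so their weight ranges become equal, which makes per-tensor quantization less lossy. The sketch below shows the general technique for two bias-free linear layers separated by ReLU; it is not the quantizer's implementation, and the function names are hypothetical.

```python
import math


def equalize(w1, w2):
    """Equalize per-channel ranges of two consecutive layers.

    w1 rows are layer-1 output channels; w2 columns are the matching
    layer-2 input channels.
    """
    new_w1 = [row[:] for row in w1]
    new_w2 = [row[:] for row in w2]
    for c in range(len(w1)):
        r1 = max(abs(v) for v in w1[c])
        r2 = max(abs(row[c]) for row in w2)
        if r1 == 0 or r2 == 0:
            continue
        s = math.sqrt(r1 / r2)  # both ranges become sqrt(r1 * r2)
        new_w1[c] = [v / s for v in w1[c]]
        for row in new_w2:
            row[c] *= s
    return new_w1, new_w2


def forward(w1, w2, x):
    """Two linear layers with ReLU in between (no biases)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


w1 = [[8.0, -4.0], [0.5, 0.25]]
w2 = [[0.5, 2.0], [-0.25, 1.0]]
e1, e2 = equalize(w1, w2)
print(e1, e2)
print(forward(w1, w2, [1.0, 0.5]), forward(e1, e2, [1.0, 0.5]))
```

The output is unchanged because ReLU commutes with positive per-channel scaling; a full implementation would also rescale the first layer's bias.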
Vitis-AI 1.3 Release
- Added support for PyTorch and TensorFlow 2.3 frameworks
- Added more ready-to-use AI models for a wider range of applications, including 3D point cloud detection and segmentation, COVID-19 chest image segmentation and other reference models
- Unified XIR-based compilation flow from edge to cloud
- Vitis AI Runtime (VART) fully open source
- New RNN overlay for NLP applications
- New CNN DPUs for the low-latency and higher throughput applications on Alveo cards
- EoU enhancement with Beta version model partitioning and custom layer/operators plug-in
Vitis-AI Release 1.2.1
v1.2.1
- Added Zynq Ultrascale Plus Whole App examples
- Updated U50 XRT and shell to Xilinx-u50-gen3x4-xdma-2-202010.1-2902115
- Updated docker launch instructions
- Updated TRD makefile instructions