Releases: microsoft/onnxruntime-extensions
v0.11.0
What's Changed
- Created a Java packaging pipeline and published the package to the Maven repository.
- Added support for converting a Hugging Face FastTokenizer into an ONNX custom operator (see the sketch after this list).
- Unified the SentencePiece tokenizer with other Byte Pair Encoding (BPE) based tokenizers.
- Fixed a pre-processing bug for the Whisper large model.
- Enabled eager execution for custom operators and refactored the header file structure.
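A rough sketch of the FastTokenizer conversion above, using the gen_processing_models API; the model id and output file name are illustrative, not taken from the release notes.

```python
# Hedged sketch: convert a Hugging Face fast tokenizer into an ONNX model
# composed of onnxruntime-extensions custom operators.
# "xlm-roberta-base" and the output path are illustrative placeholders.
import onnx
from transformers import AutoTokenizer
from onnxruntime_extensions import gen_processing_models

hf_tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

# gen_processing_models returns (pre_model, post_model); pre_kwargs requests
# the pre-processing (tokenization) graph.
pre_model = gen_processing_models(hf_tokenizer, pre_kwargs={})[0]

onnx.save(pre_model, "xlm_roberta_tokenizer.onnx")
```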
Contributions
Contributors to ONNX Runtime Extensions include members across teams at Microsoft, along with our community members: @sayanshaw24 @wenbingl @skottmckay @natke @hariharans29 @jslhcl @snnn @kazssym @YUNQIUGUO @souptc @yihonglyu
v0.12.0
What's Changed
- Added C APIs for language, vision, and audio processors, including a new FeatureExtractor for the Whisper model.
- Added support for the Phi-3 Small tokenizer and the new OpenAI tiktoken format for fast loading of BPE tokenizers (a usage sketch follows this list).
- Added new CUDA custom operators such as MulSigmoid, Transpose2DCast, ReplaceZero, AddSharedInput, and MulSharedInput.
- Enhanced the Custom Op Lite API on GPU and fused kernels for DORT.
- Bug fixes, including a null bos_token for the Qwen2 tokenizer and a SentencePiece-converted FastTokenizer issue with non-ASCII characters, as well as necessary updates for MSVC 19.40 and the NumPy 2.0 release.
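Tokenizer models exported with the extensions run in a regular ONNX Runtime session once the custom-op library is registered. A minimal sketch; the model path and the input name "text" are hypothetical placeholders:

```python
# Hedged sketch: register the onnxruntime-extensions custom-op library with an
# ONNX Runtime session and run an exported tokenizer model.
# "tokenizer.onnx" and the input name "text" are hypothetical placeholders.
import numpy as np
import onnxruntime as ort
from onnxruntime_extensions import get_library_path

so = ort.SessionOptions()
so.register_custom_ops_library(get_library_path())  # make the custom operators visible to ORT

sess = ort.InferenceSession("tokenizer.onnx", so)
outputs = sess.run(None, {"text": np.array(["hello world"], dtype=object)})
print([o.dtype for o in outputs])
```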
New Contributors
- @yihonglyu made their first contribution in #702
- @skyline75489 made their first contribution in #748
Full Changelog: v0.11.0...v0.12.0
v0.10.1
v0.10.0
What's Changed
- Modified the gen_processing_models tokenizer model to output int64, unifying the output data type of all tokenizers (see the sketch after this list).
- Implemented support for post-processing of YOLO v8 within the Python extensions package.
- Introduced 'fairseq' flag to enhance compatibility with certain Hugging Face tokenizers.
- Incorporated 'added_token' attribute into the BPE tokenizer to improve CodeGen tokenizer functionality.
- Enhanced the SentencePiece tokenizer by integrating token indices into the output.
- Added support for custom operators implemented with CUDA kernels, including two example operators.
- Added more tests on the Hugging Face tokenizer and fixed identified bugs.
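A small sketch that makes the int64 unification above concrete by inspecting the declared output types of a generated tokenizer model; the "bert-base-uncased" model id is only an example.

```python
# Hedged sketch: verify that a generated tokenizer model declares int64 outputs.
# "bert-base-uncased" is just an example; other supported tokenizers should
# behave the same way after this release.
import onnx
from transformers import AutoTokenizer
from onnxruntime_extensions import gen_processing_models

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
pre_model = gen_processing_models(tok, pre_kwargs={})[0]

for out in pre_model.graph.output:
    elem_type = out.type.tensor_type.elem_type
    # Token-id outputs are expected to report INT64 here.
    print(out.name, onnx.TensorProto.DataType.Name(elem_type))
```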
Contributions
Contributors to ONNX Runtime Extensions include members across teams at Microsoft, along with our community members: @wenbingl @sayanshaw24 @skottmckay @mszhanyi @edgchen1 @YUNQIUGUO @RandySheriffH @samwebster @hyoshioka0128 @baijumeswani @dizcza @Craigacp @jslhcl
v0.9.0
What's Changed
- New Python API gen_processing_models to export ONNX data-processing models from Hugging Face tokenizers such as LLaMA, CLIP, XLM-RoBERTa, Falcon, BERT, etc. (see the sketch after this list).
- New TrieTokenizer operator for RWKV-like LLM models, and other tokenizer operator enhancements.
- New operators for Azure EP compatibility: AzureAudioToText, AzureTextToText, AzureTritonInvoker for Python and NuGet packages.
- Processing operators have been migrated to the new Lite Custom Op API.
- New string-strip operator.
- Use the latest ORT header instead of the minimum compatible headers.
- Support offset mapping in most tokenizers, such as BERT, CLIP, and RoBERTa.
- Removed the deprecated std::codecvt_utf8 from the code base.
- Documentation has been uploaded to https://onnxruntime.ai/docs/extensions/.
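A minimal sketch of the new gen_processing_models API; it exports both a tokenizer (pre-processing) and a detokenizer (post-processing) model, assuming the chosen tokenizer type has a matching decoding operator. The model id and file names are illustrative.

```python
# Hedged sketch: export pre- and post-processing ONNX models for a Hugging Face
# tokenizer with gen_processing_models.
# "gpt2" and the output file names are illustrative placeholders; the decoder
# model assumes the tokenizer type has a corresponding decoding operator.
import onnx
from transformers import AutoTokenizer
from onnxruntime_extensions import gen_processing_models

tok = AutoTokenizer.from_pretrained("gpt2")

# pre_kwargs requests the tokenization graph, post_kwargs the decoding graph.
pre_model, post_model = gen_processing_models(tok, pre_kwargs={}, post_kwargs={})

onnx.save(pre_model, "gpt2_tokenize.onnx")
onnx.save(post_model, "gpt2_detokenize.onnx")
```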
Contributions
Contributors to ONNX Runtime Extensions include members across teams at Microsoft, along with our community members: @aidanryan-msft @RandySheriffH @edgchen1 @kunal-vaishnavi @sayanshaw24 @skottmckay @snnn @VishalX @wenbingl @wejoncy
v0.8.0
New Changes:
- NuGet package for the .NET platform. This package offers comprehensive platform support, including Windows, Linux, macOS, Android, and iOS. Both x64 and arm64 architectures are supported, where applicable.
- Support for pre-processing and post-processing of the Whisper model, including audio and tokenizer decoding operators.
- Extends support for pre-processing and post-processing of object-detection models, including a new DrawBoundingBoxes operator. The pre/post-processing tools can add non-max suppression to the model to select the best bounding boxes and scale them to the original image. See the end-to-end example in tutorials/yolo_e2e.py.
- Introduces the Audio Domain, complemented with AudioCodec and optimized STFT operators, enhancing audio processing capabilities.
- Enabled optional input/output support for some operators such as GPT2Tokenizer, ClipTokenizer, and RobertaTokenizer.
- Refined the offset-mapping implementation for BBPE-style tokenizers, covering more operators and improving efficiency.
- Other bug and security fixes.
Contributions
Contributors to ONNX Runtime Extensions include members across teams at Microsoft, along with our community members: @edgchen1 @kunal-vaishnavi @sayanshaw24 @skottmckay @snnn @VishalX @wenbingl @wejoncy
Full Changelog: v0.7.0...v0.8.0
v0.7.0
General
1. New custom operators: RobertaTokenizer, ClipTokenizer, EncodeImage, DecodeImage
2. ORT custom operator C++ stub generation tool
3. Improved operator implementations and documentation.
4. Compatible with Python 3.7–3.10 and ORT 1.10 and above.
Mobile
1. Android package: Maven
2. iOS package: CocoaPods
3. PrePostProcessor tool for mobile models
4. Super-resolution model pre-/post-processing end-to-end examples
Contributors to this release include members across teams at Microsoft, along with our community members: @edgchen1 @skottmckay @shaahji @sayanshaw24 @snnn @wenbingl @natke @YUNQIUGUO @guschmue @JamieMagee @adrianlizarraga @wejoncy @matheusgomes28
v0.5.0
This release contains the C++ source code package only. Python and other packages are expected in the next release.
What's Changed
- Support for the OpenCV core and imgproc modules in custom operator implementations.
- Code security compliance fixes.
- Some other improvements.
Thanks to our contributors: @joburkho @shaahji @TruscaPetre @Sanster @natke @hombreola @snnn @leqiao-1 @wenbingl