
Releases: nvidia-holoscan/holoscan-sdk

v2.4.0

05 Sep 23:54
9856d17

Release Artifacts

See supported platforms for compatibility.

Release Notes

New Features and Improvements

  • The Holoscan CLI packager has been updated to create application containers that are up to 78% smaller than the standard containers produced by previous releases. The new --includes option allows the packager to include only the runtime dependencies relevant to the application. Refer to the documentation for more information.

  • The Holoscan pipeline metadata feature introduced for the C++ API in release v2.3 is now also available from the Python API. Metadata can be read and updated through an API very similar to Python's built-in dictionaries. Please see the dynamic application metadata section of the user guide for details.
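    A minimal sketch of the dict-like access from a native Python operator's compute method (assuming metadata is enabled for the application; the port names and metadata keys here are illustrative, not from the release notes):

      def compute(self, op_input, op_output, context):
          frame = op_input.receive("in")
          # read and update metadata entries much like a built-in dict
          if "frame_id" in self.metadata:
              print(self.metadata["frame_id"])
          self.metadata["processed_by"] = self.name
          op_output.emit(frame, "out")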

  • The V4L example now supports a YUV configuration to display YUV input directly without RGB conversion.

  • Added support for building the Holoscan SDK without the Docker cache via the run script, e.g., ./run build --no-cache

Core
  • It is now possible to set a HOLOSCAN_QUEUE_POLICY environment variable to override the default queue policy that is used by the input and output ports of the SDK. Valid options (case-insensitive) are:

    • "pop": when the queue is full, a newly arriving item replaces the oldest item
    • "reject": when the queue is full, a newly arriving item is discarded
    • "fail": terminate the application if a new item arrives when the queue is full

    The default behavior remains "fail" if the environment variable is not specified. If an operator's setup method explicitly sets a receiver or transmitter via the IOSpec::connector method, this default value does not override the policy of that connector.
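    For example, a minimal sketch of selecting the policy from Python (any of the values listed above may be used):

      import os

      # Must be set before the application is run so the SDK picks it up
      # when the ports' default queues are created.
      os.environ["HOLOSCAN_QUEUE_POLICY"] = "reject"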

  • Improved the handling of extra arguments passed to the application via the CLI.

    • The CLI11 parser now calls the allow_extras() method to permit extra arguments instead of catching an ExtrasError and ignoring them.
  • The SDK no longer spawns a new process to check for unused network ports for UCX communication when running a distributed application. Previously, this dedicated process caused issues such as redundant system resource consumption when a Holoscan distributed application ran as part of a larger application (e.g., importing Holoscan as a Python module after importing other modules). The port check is now performed in-process.

Operators/Resources
  • Two new operators useful for examples and testing were added. PingTensorTxOp will emit a TensorMap containing a single tensor with user-specified name, shape, data type, and storage type (e.g., host vs. device). PingTensorRxOp will receive a message containing a TensorMap and print some attributes of any tensors contained within it. Versions of these previously existed in examples and test code, but they have now been moved to a common public location (the holoscan::ops namespace for C++ and holoscan.operators for Python).
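    A minimal Python sketch pairing the two operators (assuming default parameters and the operators' single output/input ports):

      from holoscan.conditions import CountCondition
      from holoscan.core import Application
      from holoscan.operators import PingTensorRxOp, PingTensorTxOp

      class PingTensorApp(Application):
          def compose(self):
              # emit ten tensors with the default name/shape, then stop
              tx = PingTensorTxOp(self, CountCondition(self, 10), name="tx")
              rx = PingTensorRxOp(self, name="rx")
              self.add_flow(tx, rx)

      PingTensorApp().run()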
  • The V4L2VideoCaptureOp now supports passing the input buffer unmodified to the output. This can be enabled with the pass_through parameter, which is disabled by default.
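    For example (a sketch within an application's compose method; the remaining capture parameters are assumed to come from the YAML config):

      from holoscan.operators import V4L2VideoCaptureOp

      source = V4L2VideoCaptureOp(
          self,
          name="source",
          pass_through=True,  # forward the captured buffer unmodified
          **self.kwargs("source"),
      )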
  • Handle/enhance various cases of multi-receiver input ports (holoscan::InputContext::receive<std::vector<T>>())
    • Support receiving an array of TensorMap items from the input port.
    • Improve handling of cases where no data or null pointers are received from the input port.
    • Throw an invalid argument exception if the operator attempts to receive non-vector data (op_input.receive<T>()) from an input port with a queue size of IOSpec::kAnySize.
    • Avoid using nvidia::TypenameAsString for the type name in error messages, as it may include characters that are not permitted in the message (e.g., {anonymous}) and could be interpreted as a format specifier, resulting in an exception during message formatting.
  • The HolovizOp now supports YUV (aka YCbCr) images as input. Various 420 and 422 formats are supported.
    • New image formats:
      • y8u8y8v8_422_unorm
      • u8y8v8y8_422_unorm
      • y8_u8v8_2plane_420_unorm
      • y8_u8v8_2plane_422_unorm
      • y8_u8_v8_3plane_420_unorm
      • y8_u8_v8_3plane_422_unorm
      • y16_u16v16_2plane_420_unorm
      • y16_u16v16_2plane_422_unorm
      • y16_u16_v16_3plane_420_unorm
      • y16_u16_v16_3plane_422_unorm
    • YUV color model conversions:
      • yuv_601
      • yuv_709
      • yuv_2020
    • YUV ranges:
      • itu_full
      • itu_narrow
    • Chroma locations in x and y:
      • cosited_even
      • midpoint
  • The AJASourceOp now supports the following video formats:
    • 720p @ 50, 59.94, 60Hz
    • 1080i @ 50, 59.94, 60Hz
    • 1080p @ 23.98, 24, 25, 29.97, 30, 50, 59.94, 60Hz
    • 3840x2160 (UHD) @ 23.98, 24, 25, 29.97, 30, 50, 59.94, 60Hz
    • 4096x2160 (4K) @ 23.98, 24, 25, 29.97, 30, 50, 59.94, 60Hz
Holoviz module
  • Now supports YUV (aka YCbCr) images and YUV conversion parameters. The functions to specify image layer data have been extended to support planar formats.
    • New entry point ImageYuvModelConversion() to specify the YUV model conversion (BT.601, BT.709, BT.2020)
    • New entry point ImageYuvRange() to specify the YUV range (ITU full and ITU narrow)
    • New entry point ImageChromaLocation() to specify the chroma location (cosited even and midpoint)
Utils
  • An aja_build.sh script was added to automate the download, build, and loading of the AJA NTV2 drivers and SDK.
HoloHub
Documentation

Breaking Changes

Bug fixes

Issue Description
- Holoviz operator fails with Surface format '29, 0' not supported when enabling sRGB framebuffer in headless mode.
- Fixed a bug where the ./run vscode --parallel <num_workers> command was not working as expected, displaying the message arg: unbound variable.
4791938 v4l_camera doesn't work with the USB camera when 800x600 is set, and there are multiple sizes available for width 800.
4792457 Heap memory error was found in GXFParameterAdaptor with AddressSanitizer (ASAN) during dynamic analysis.
4752615 In Python, the operator's parameter values are not available in the initialize() method. This bug was introduced in version 2.1.0.
4510522 V4L2VideoCaptureOp does not work with RGB.

Known Issues

This section supplies details about issues discovered during development and QA but not resolved in this release.

Issue Description
4062979 When Operators connected in a Directed Acyclic Graph (DAG) are executed by a multithreaded scheduler, their execution order in the graph is not guaranteed.
4267272 AJA drivers cannot be built with RDMA on IGX SW 1.0 DP iGPU due to missing nv-p2p.h. Expected to be addressed in IGX SW 1.0 GA.
4384768 No RDMA support on JetPack 6.0 DP and IGX SW 1.0 DP iGPU due to missing nv-p2p kernel module. Expected to be addressed in JP 6.0 GA and IGX SW 1.0 GA respectively.
4190019 Holoviz segfaults on multi-gpu setup when specifying device using the --gpus flag with docker run. Current workaround is to use CUDA_VISIBLE_DEVICES in the container instead.
4210082 v4l_camera example seg faults at exit.
4339399 High CPU usage observed with video_replayer_distributed application. While the high CPU usage associated with the GXF UCX extension has been fixed since v1.0, distributed applications using the MultiThreadScheduler (with the check_recession_period_ms parameter set to 0 by default) may still experience high CPU usage. Setting the HOLOSCAN_CHECK_RECESSION_PERIOD_MS environment variable to a value greater than 0 (e.g. 1.5) can help reduce CPU usage. However, this may result in increased latency for the application until the MultiThreadScheduler switches to an event-based multithreaded scheduler.
4318442 UCX cuda_ipc protocol doesn't work in Docker containers on x86_64. As a workaround, we are currently disabling the UCX cuda_ipc protocol on all platforms via the UCX_TLS environment variable.
4325468 The V4L2VideoCapture operator only supports YUYV and AB24 source pixel formats, and only outputs the RGBA GXF video format. Other source pixel formats compatible with V4L2 can be manually defined by the user, but they're assumed to be equivalent to RGBA8888.
4325585 Applications using MultiThreadScheduler may exit early due to timeouts. This occurs when the stop_on_deadlock_timeout parameter is improperly set to a value equal to or less than check_recession_period_ms, particularly if check_recession_period_ms is greater than zero.
4301203 HDMI IN fails in v4l2_camera on IGX Orin Devkit for some resolutions or formats. Try the latest firmware as a partial fix. Driver-level fixes expected in IGX SW 1.0 GA.
4384348 UCX termination (via ctrl+c, pressing 'Esc', or clicking the close button) is not smooth and can show multiple error messages.
4481171 Running the driver for distributed applications on IGX Orin devkits fails when connected to other systems through eth1. A workaround is to use the eth0 port to connect to other systems for distributed workloads.

v2.3.0

05 Aug 15:23

Release Artifacts

See supported platforms for compatibility.

Release Notes

New Features and Improvements

Core
  • The copy constructor and assignment operator of the Config/Executor/Graph classes are now explicitly deleted to prevent copying of these objects, which are not intended to be copied (copying them is inefficient). This change is backward compatible, as the classes are still movable.
    • [Internal] Application::fragment_graph_ is now a std::shared_ptr to FragmentGraph to prevent copying of FragmentGraph objects in Python bindings.
    • [Internal] Fragment::graph_ is now a std::shared_ptr to OperatorGraph to prevent copying of OperatorGraph objects in Python bindings.
    • [Internal] Added Fragment::config_shared()/Fragment::executor_shared()/Fragment::graph_shared() to return a shared pointer to the Config/Executor/OperatorGraph objects respectively.
Operators/Resources
  • A new decorator, holoscan.decorator.create_op, is provided that can wrap an existing function or generator as a native Python Operator. This new API is still considered experimental and may be updated in a subsequent release based on initial feedback. An example of using this decorator is provided under examples/python_decorator/video_replayer.py as well as in the test applications within python/tests/system/test_decorator_apps.py.
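    A minimal sketch of the decorator API (the functions, port names, and data are illustrative; see the referenced example files for canonical usage):

      import numpy as np

      from holoscan.conditions import CountCondition
      from holoscan.core import Application
      from holoscan.decorator import create_op

      @create_op(outputs="image")
      def random_image():
          # a wrapped generator becomes a source operator
          while True:
              yield np.random.randint(0, 256, (64, 64), dtype=np.uint8)

      @create_op(inputs="image")
      def print_mean(image):
          # a wrapped function becomes a processing (here, sink) operator
          print(image.mean())

      class DecoratorApp(Application):
          def compose(self):
              src = random_image(self, CountCondition(self, 10), name="src")
              sink = print_mean(self, name="sink")
              self.add_flow(src, sink)

      DecoratorApp().run()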
Utils
HoloHub
Documentation

Breaking Changes

Bug fixes

Issue Description
4687735 Fixed a bug where the CountCondition in Python did not accept a negative value for the count parameter, even though it is allowed in C++. Note that using a negative value for count is not recommended, as the count would decrement until it reaches the minimum value of the data type (int64_t), wrap around, and then count down from the maximum value to zero.
4689604 Fixed a bug where GXF extensions listed in the config file (YAML) are not loaded when GXFCodeletOp or GXFComponentResource is used. Test cases have been added to verify the fix, and the documentation has been updated to reflect the changes. The HOLOSCAN_WRAP_GXF_COMPONENT_AS_RESOURCE macro has also been updated to support the constructor with no arguments.
4706559 Fixed a bug where nullptr was returned when no message was available in calls to holoscan::IOContext::receive<T*>() or holoscan::IOContext::receive<std::shared_ptr<T>>() in the C++ API. The receive method now correctly returns a holoscan::unexpected<holoscan::RuntimeError> value when no message is available.

Known Issues

This section supplies details about issues discovered during development and QA but not resolved in this release.

Issue Description
4062979 When Operators connected in a Directed Acyclic Graph (DAG) are executed by a multithreaded scheduler, their execution order in the graph is not guaranteed.
4267272 AJA drivers cannot be built with RDMA on IGX SW 1.0 DP iGPU due to missing nv-p2p.h. Expected to be addressed in IGX SW 1.0 GA.
4384768 No RDMA support on JetPack 6.0 DP and IGX SW 1.0 DP iGPU due to missing nv-p2p kernel module. Expected to be addressed in JP 6.0 GA and IGX SW 1.0 GA respectively.
4190019 Holoviz segfaults on multi-gpu setup when specifying device using the --gpus flag with docker run. Current workaround is to use CUDA_VISIBLE_DEVICES in the container instead.
4210082 v4l_camera example seg faults at exit.
4339399 High CPU usage observed with video_replayer_distributed application. While the high CPU usage associated with the GXF UCX extension has been fixed since v1.0, distributed applications using the MultiThreadScheduler (with the check_recession_period_ms parameter set to 0 by default) may still experience high CPU usage. Setting the HOLOSCAN_CHECK_RECESSION_PERIOD_MS environment variable to a value greater than 0 (e.g. 1.5) can help reduce CPU usage. However, this may result in increased latency for the application until the MultiThreadScheduler switches to an event-based multithreaded scheduler.
4318442 UCX cuda_ipc protocol doesn't work in Docker containers on x86_64. As a workaround, we are currently disabling the UCX cuda_ipc protocol on all platforms via the UCX_TLS environment variable.
4325468 The V4L2VideoCapture operator only supports YUYV and AB24 source pixel formats, and only outputs the RGBA GXF video format. Other source pixel formats compatible with V4L2 can be manually defined by the user, but they're assumed to be equivalent to RGBA8888.
4325585 Applications using MultiThreadScheduler may exit early due to timeouts. This occurs when the stop_on_deadlock_timeout parameter is improperly set to a value equal to or less than check_recession_period_ms, particularly if check_recession_period_ms is greater than zero.
4301203 HDMI IN fails in v4l2_camera on IGX Orin Devkit for some resolutions or formats. Try the latest firmware as a partial fix. Driver-level fixes expected in IGX SW 1.0 GA.
4384348 UCX termination (via ctrl+c, pressing 'Esc', or clicking the close button) is not smooth and can show multiple error messages.
4481171 Running the driver for distributed applications on IGX Orin devkits fails when connected to other systems through eth1. A workaround is to use the eth0 port to connect to other systems for distributed workloads.
4458192 In scenarios where distributed applications have both the driver and workers running on the same host, either within a Docker container or directly on the host, there's a possibility of encountering "Address already in use" errors. A potential solution is to assign a different port number to the HOLOSCAN_HEALTH_CHECK_PORT environment variable (default: 8777), for example, by using export HOLOSCAN_HEALTH_CHECK_PORT=8780.
Wayland: holoscan::viz::Init() with existing GLFW window fails.

Holoscan SDK v2.2.0 Release

02 Jul 00:30

Release Artifacts

See supported platforms for compatibility.

Release Notes

New Features and Improvements

Core
  • The copy constructor and assignment operator of the Config/Executor/Graph classes are now explicitly deleted to prevent copying of these objects, which are not intended to be copied (copying them is inefficient). This change is backward compatible, as the classes are still movable.
    • [Internal] Application::fragment_graph_ is now a std::shared_ptr to FragmentGraph to prevent copying of FragmentGraph objects in Python bindings.
    • [Internal] Fragment::graph_ is now a std::shared_ptr to OperatorGraph to prevent copying of OperatorGraph objects in Python bindings.
    • [Internal] Added Fragment::config_shared()/Fragment::executor_shared()/Fragment::graph_shared() to return a shared pointer to the Config/Executor/OperatorGraph objects respectively.
Operators/Resources
  • A new decorator, holoscan.decorator.create_op, is provided that can wrap an existing function or generator as a native Python Operator. This new API is still considered experimental and may be updated in a subsequent release based on initial feedback. An example of using this decorator is provided under examples/python_decorator/video_replayer.py as well as in the test applications within python/tests/system/test_decorator_apps.py.
Utils
HoloHub
Documentation

Breaking Changes

Bug fixes

Issue Description
4687735 Fixed a bug where the CountCondition in Python did not accept a negative value for the count parameter, even though it is allowed in C++. Note that using a negative value for count is not recommended, as the count would decrement until it reaches the minimum value of the data type (int64_t), wrap around, and then count down from the maximum value to zero.
4689604 Fixed a bug where GXF extensions listed in the config file (YAML) are not loaded when GXFCodeletOp or GXFComponentResource is used. Test cases have been added to verify the fix, and the documentation has been updated to reflect the changes. The HOLOSCAN_WRAP_GXF_COMPONENT_AS_RESOURCE macro has also been updated to support the constructor with no arguments.
4706559 Fixed a bug where nullptr was returned when no message was available in calls to holoscan::IOContext::receive<T*>() or holoscan::IOContext::receive<std::shared_ptr<T>>() in the C++ API. The receive method now correctly returns a holoscan::unexpected<holoscan::RuntimeError> value when no message is available.

Known Issues

This section supplies details about issues discovered during development and QA but not resolved in this release.

Issue Description
4062979 When Operators connected in a Directed Acyclic Graph (DAG) are executed by a multithreaded scheduler, their execution order in the graph is not guaranteed.
4267272 AJA drivers cannot be built with RDMA on IGX SW 1.0 DP iGPU due to missing nv-p2p.h. Expected to be addressed in IGX SW 1.0 GA.
4384768 No RDMA support on JetPack 6.0 DP and IGX SW 1.0 DP iGPU due to missing nv-p2p kernel module. Expected to be addressed in JP 6.0 GA and IGX SW 1.0 GA respectively.
4190019 Holoviz segfaults on multi-gpu setup when specifying device using the --gpus flag with docker run. Current workaround is to use CUDA_VISIBLE_DEVICES in the container instead.
4210082 v4l_camera example seg faults at exit.
4339399 High CPU usage observed with video_replayer_distributed application. While the high CPU usage associated with the GXF UCX extension has been fixed since v1.0, distributed applications using the MultiThreadScheduler (with the check_recession_period_ms parameter set to 0 by default) may still experience high CPU usage. Setting the HOLOSCAN_CHECK_RECESSION_PERIOD_MS environment variable to a value greater than 0 (e.g. 1.5) can help reduce CPU usage. However, this may result in increased latency for the application until the MultiThreadScheduler switches to an event-based multithreaded scheduler.
4318442 UCX cuda_ipc protocol doesn't work in Docker containers on x86_64. As a workaround, we are currently disabling the UCX cuda_ipc protocol on all platforms via the UCX_TLS environment variable.
4325468 The V4L2VideoCapture operator only supports YUYV and AB24 source pixel formats, and only outputs the RGBA GXF video format. Other source pixel formats compatible with V4L2 can be manually defined by the user, but they're assumed to be equivalent to RGBA8888.
4325585 Applications using MultiThreadScheduler may exit early due to timeouts. This occurs when the stop_on_deadlock_timeout parameter is improperly set to a value equal to or less than check_recession_period_ms, particularly if check_recession_period_ms is greater than zero.
4301203 HDMI IN fails in v4l2_camera on IGX Orin Devkit for some resolutions or formats. Try the latest firmware as a partial fix. Driver-level fixes expected in IGX SW 1.0 GA.
4384348 UCX termination (via ctrl+c, pressing 'Esc', or clicking the close button) is not smooth and can show multiple error messages.
4481171 Running the driver for distributed applications on IGX Orin devkits fails when connected to other systems through eth1. A workaround is to use the eth0 port to connect to other systems for distributed workloads.
4458192 In scenarios where distributed applications have both the driver and workers running on the same host, either within a Docker container or directly on the host, there's a possibility of encountering "Address already in use" errors. A potential solution is to assign a different port number to the HOLOSCAN_HEALTH_CHECK_PORT environment variable (default: 8777), for example, by using export HOLOSCAN_HEALTH_CHECK_PORT=8780.
Wayland: holoscan::viz::Init() with existing GLFW window fails.

Holoscan SDK v2.1.0

05 Jun 15:30

Release Artifacts

See supported platforms for compatibility.

Release Notes

New Features and Improvements

Core
  • A report with execution time statistics for individual operators can now be enabled. This report will contain information like median, 90th percentile, and maximum times for operator execution. Setting the environment variable HOLOSCAN_ENABLE_GXF_JOB_STATISTICS=true enables this report (it is disabled by default as statistics collection may introduce a minor performance overhead). For more details, see the documentation on the feature (https://docs.nvidia.com/holoscan/sdk-user-guide/gxf_job_statistics.html).

  • The holoscan.Tensor object's data property in the Python API now returns an integer (pointer address) instead of a NULL PyCapsule object, potentially avoiding confusion about data availability. Users can confirm the presence of data via the __array_interface__ or __cuda_array_interface__ properties. This change allows for direct access to the data pointer, facilitating debugging and performance optimization.

  • The string representation of the IOSpec object, generated by IOSpec::to_yaml_node(), now includes ConditionType information in the type field. It correctly displays kNone when no condition (ConditionType::kNone in C++ and ConditionType.NONE in Python) is explicitly set:

    name: receiver
    io_type: kInput
    typeinfo_name: N8holoscan3gxf6EntityE
    connector_type: kDefault
    conditions:
      - type: kNone
    
  • Enhanced the macros (HOLOSCAN_CONDITION_FORWARD_TEMPLATE, HOLOSCAN_RESOURCE_FORWARD_TEMPLATE, HOLOSCAN_OPERATOR_FORWARD_TEMPLATE, etc.) by using the full namespace of the classes, improving their robustness and adaptability across different namespace scopes.

  • Updated the holoscan.core.py_object_to_arg() method to allow conversion of Python objects to Arg objects using YAML::Node. This resolves type mismatches, such as when the underlying C++ parameter expects an int32_t type but Python uses int64_t.

  • The Python OperatorSpec/ComponentSpec classes now expose inputs and outputs properties, providing direct access to the input and output IO specs. This enhancement simplifies the process of setting conditions on inputs and outputs.

    def setup(self, spec: OperatorSpec):
      spec.input("data")
    
      # Set the NONE condition on the input port named `data`.
      spec.inputs["data"].condition(ConditionType.NONE)
    
      print(spec.inputs["data"])
  • Workflows where an operator connects to multiple downstream operators within the same fragment may see a minor performance boost. This is due to an internal refactoring of how connections between operators are made. Previously, a GXF broadcast codelet was automatically inserted into the graph behind the scenes to broadcast the output to multiple receivers. As of this release, a direct 1:N connection from the output port is made without the framework needing to insert this extra codelet.

  • fmt::format support for printing the Parameter class has been added (there is no longer a need to call the get() method to print the contained value). This allows parameter values to be printed directly in HOLOSCAN_LOG_* statements. For example:

  MetaParameter p = MetaParameter<int>(5);
  HOLOSCAN_LOG_INFO("Formatted parameter value: {}", p);

  // can also pass parameter to fmt::format
  std::string format_message = fmt::format("{}", p);
Operators/Resources
  • Most built-in operators now do additional validation of input tensors and will raise more helpful messages if the dimensions, data type, or memory layout of the provided tensors are not as expected. The remaining operators (InferenceOp, InferenceProcessorOp) will be updated in the next release.

  • BayerDemosaicOp and FormatConverterOp will now automatically perform host->device copies if needed for either nvidia::gxf::VideoBuffer or Tensor inputs. Previously, these operators only did the transfer automatically for nvidia::gxf::VideoBuffer, not for Tensor, and in the case of FormatConverterOp the transfer was only done automatically for pinned host memory. As of this release, both operators copy only unpinned system memory, leaving pinned host memory as-is.

  • When creating Python bindings for C++ operators, it is now possible to register custom type conversion functions for user defined C++ types. These handle conversion to and from a corresponding Python type. See the newly expanded section on creating Python bindings for C++ operators for details.

  • As of this release, all provided Python operators support passing conditions such as CountCondition or PeriodicCondition as positional arguments. In previous releases, there was a limitation that Python operators that wrapped an underlying C++ operator did not support this. As a concrete example, one could now pass a CountCondition to limit the number of frames the visualization operator will run for.

      holoviz = HolovizOp(
          self,
          # add count condition to stop the application after short duration (i.e. for testing)
          CountCondition(self, count),
          name="holoviz",
          **self.kwargs("holoviz"),
      )
  • The AJA NTV2 dependency, and the corresponding AJA Source Operator, have been updated to use the latest official AJA NTV2 17.0.1 release. This new NTV2 version also introduces support for the KONA XM hardware.

  • The Holoviz operator now supports setting the camera for layers rendered in 3d (geometry layer with 3d primitives and depth map layer).
    The camera eye, look at and up vectors can be initialized using parameters or dynamically changed at runtime by providing data at the respective input channels.
    More information can be found in the documentation.
    There is also a new C++ example holoviz_camera.cpp.

  • The Holoviz operator now supports different types of camera pose outputs. In addition to the 4x4 row-major projection matrix, a camera extrinsics model of type nvidia::gxf::Pose3D can now also be output. The output type is selected by setting the camera_pose_output_type parameter.

  • The Holoviz operator now supports Wayland. The run launch command has also been updated to support Wayland.

  • The inference operator (InferenceOp) now supports a new optional parameter, temporal_map, which can be used to specify a frame interval at which inference will be run. For example, setting a value of 10 for a given model will result in inference only being run on every 10th frame. Intermediate frames will output the result from the most recent frame at which inference was run. The interval value is specified per-model, allowing different inference models to be run at different rates.
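    A hypothetical Python sketch (assuming temporal_map accepts a model-name-to-interval mapping as described above, with the remaining parameters following the usual InferenceOp setup via the YAML config):

      from holoscan.operators import InferenceOp

      inference = InferenceOp(
          self,
          name="inference",
          temporal_map={"model_a": 10},  # run "model_a" only every 10th frame
          **self.kwargs("inference"),
      )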

  • The existing asynchronous scheduling condition is now also available from Python (via holoscan.conditions.AsynchronousCondition). For an example of usage, see the new asynchronous ping example.

  • We introduce the GXFCodeletOp and GXFComponentResource classes, streamlining the import of GXF Codelets and Components into Holoscan applications. These additions simplify the setup process, allowing users to utilize custom GXF components more intuitively and efficiently.

    // C++
    auto tx = make_operator<ops::GXFCodeletOp>(
        "tx",
        "nvidia::gxf::test::SendTensor",
        make_condition<CountCondition>(15),
        Arg("pool") = make_resource<GXFComponentResource>(
            "pool",
            "nvidia::gxf::BlockMemoryPool",
            Arg("storage_type") = static_cast<int32_t>(1),
            Arg("block_size") = 1024UL,
            Arg("num_blocks") = 2UL));

    # Python
    tx = GXFCodeletOp(
        self,
        "nvidia::gxf::test::SendTensor",
        CountCondition(self, 15),
        name="tx",
        pool=GXFComponentResource(
            self,
            "nvidia::gxf::BlockMemoryPool",
            name="pool",
            storage_type=1,
            block_size=1024,
            num_blocks=2,
        ),
    )

    Please check out the examples in the examples/import_gxf_components directory for more information on how to use these new classes.

  • When calling op_output.emit from the compute method of a Python operator, it is now possible to provide a third argument that overrides the default choice of the emitted object's type. This is sometimes needed to emit a certain C++ type from a native Python operator when connecting it to a different Python operator that wraps an underlying C++ operator. For example, one could emit a Python string as a C++ std::string instead of the Python string object via op_output.emit(py_str, "port_name", "std::string"). See additional examples and a table of the C++ types registered by default [here](https://docs.nvidia.com/holoscan/sdk-user-guide/holoscan_create_operator_...


Holoscan SDK v2.0.0

19 Apr 15:00
1e011a4

Release Artifacts

See supported platforms for compatibility.

Release Notes

New Features and Improvements

Core
  • make_condition, make_fragment, make_network_context, make_operator, make_resource, and
    make_scheduler now accept a non-const string or character array for the name parameter.
  • A new event-based multi-thread scheduler (EventBasedScheduler) is available. It is an alternative to the existing, polling-based MultiThreadScheduler and can be used as a drop-in replacement. The only difference in parameters is that it does not take a check_recession_period_ms parameter, as there is no such polling interval for this scheduler. It should give similar performance to the MultiThreadScheduler with a very short polling interval, but without the high CPU usage seen for the multi-thread scheduler in that case (due to constant polling for work by one thread).
  • When an exception is raised from the Operator methods start, stop, or compute, that exception will first trigger the underlying GXF scheduler to terminate the application graph, and then the exception will be raised by the Holoscan SDK. This resolves an issue with inconsistent behavior between Python and C++ apps in how exceptions were handled, and fixes a crash in C++ apps when an operator raised an exception from the start or stop methods.
  • Now, when an exception occurs during the execution of a Holoscan application, it is propagated to the application's run method, allowing users to catch and manage exceptions within their application. Previously, the Holoscan runtime would catch and log exceptions, with the application continuing to run (in Python) or exiting (in C++) without a clear indication of the exception's origin. Users can catch and manage exceptions by enclosing the run method in a try block.
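    A minimal sketch (MyApp is a hypothetical Application subclass):

      app = MyApp()
      try:
          app.run()
      except Exception as e:
          print(f"Application error: {e}")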
  • The holoscan::Fragment::run_async (C++) and holoscan.Application.run_async (Python) methods return std::future and concurrent.futures.Future, respectively. The revised documentation advises using future.get() in C++ and future.result() in Python to wait until the application has completed execution and to address any exceptions that occurred.
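    A sketch of the Python pattern (app is a hypothetical application instance):

      future = app.run_async()
      # ... do other work while the application runs ...
      future.result()  # waits for completion and re-raises any exception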
Operators
  • V4L2 Video Capture: added support to set manual exposure and gain values for cameras that support it.
  • Inference: one can now run multiple instances of the Inference operator in a single application without resource conflicts.
Utils
  • Can now build from source for iGPU (IGX iGPU, JetPack) from a non-iGPU system (IGX dGPU, x86_64)
  • The NGC container now supports packaging and running Holoscan Application Packages using the Holoscan CLI.
  • CLI runner: better handling of available GPUs by reading the package manifest file and checking the system for available GPUs. A new --gpus argument can override the default values.

Breaking Changes

  • The VideoStreamRecorderOp and VideoStreamReplayerOp now work without requiring the libgxf_stream_playback.so extension. Now that the extension is unused, it has been removed from the SDK and should no longer be listed under the extensions section of application YAML files using these operators.

  • As of version 2.0, we have removed certain Python bindings to align with the unified logger interface:

    • Removed APIs:
      • holoscan.logger.enable_backtrace()
      • holoscan.logger.disable_backtrace()
      • holoscan.logger.dump_backtrace()
      • holoscan.logger.should_backtrace()
      • holoscan.logger.flush()
      • holoscan.logger.flush_level()
      • holoscan.logger.flush_on()
    • However, the following APIs remain accessible for Python. These are intended for logging in Holoscan's core or for C++ operators (e.g., using the HOLOSCAN_LOG_INFO macro), and are not designed for Python's logging framework. Python API users are advised to utilize the standard logging module for their logging needs:
      • holoscan.logger.LogLevel
      • holoscan.logger.log_level()
      • holoscan.logger.set_log_level()
      • holoscan.logger.set_log_pattern()
  • Several GXF headers have moved from gxf/std to gxf/core:

    • parameter_parser.hpp
    • parameter_parser_std.hpp
    • parameter_registrar.hpp
    • parameter_storage.hpp
    • parameter_wrapper.hpp
    • resource_manager.hpp
    • resource_registrar.hpp
    • type_registry.hpp
  • Some C++ code for tensor interoperability has been upstreamed from Holoscan SDK into GXF. The public holoscan::Tensor class will remain, but there have been a small number of backward incompatible changes in related C++ classes and methods in this release. Most of these were used internally and are unlikely to affect existing applications.

    • supporting classes holoscan::gxf::GXFTensor and holoscan::gxf::GXFMemoryBuffer have been removed. The DLPack functionality that was formerly in holoscan::gxf::GXFTensor is now available upstream in GXF's nvidia::gxf::Tensor.
    • The struct holoscan::gxf::DLManagedTensorCtx has been renamed to holoscan::gxf::DLManagedTensorContext and is now just an alias for nvidia::gxf::DLManagedTensorContext. It also has two additional fields (dl_shape and dl_strides) to hold the shape/stride information used by DLPack.
    • holoscan::gxf::DLManagedMemoryBuffer is now an alias to nvidia::gxf::DLManagedMemoryBuffer
  • The GXF UCX extension, used in distributed applications, now sends data asynchronously by default, which can lead to issues such as insufficient memory on the transmitter side when a memory pool is used. Specifically, the concern is only for operators that have a memory pool and connect to an operator in a separate fragment of the distributed application. As a workaround, users can increase the num_blocks parameter to a higher value in the BlockMemoryPool or use the UnboundedAllocator to avoid the problem. This issue will be addressed in a future release by providing a more robust solution to handle the asynchronous data transmission feature of the UCX extension, eliminating the need for manual intervention (see Known Issue 4601414).

    • For fragments using a BlockMemoryPool, the num_blocks parameter can be increased to a higher value to avoid the issue. For example, the following code snippet shows the existing BlockMemoryPool resource being created with a higher number of blocks:

      // C++
      recorder_format_converter = make_operator<ops::FormatConverterOp>(
        "recorder_format_converter",
        from_config("recorder_format_converter"),
        Arg("pool") =
          //make_resource<BlockMemoryPool>("pool", 1, source_block_size, source_num_blocks));
          make_resource<BlockMemoryPool>("pool", 1, source_block_size, source_num_blocks * 2));

      # Python
      source_pool_kwargs = dict(
          storage_type=MemoryStorageType.DEVICE,
          block_size=source_block_size,
          # num_blocks=source_num_blocks,
          num_blocks=source_num_blocks * 2,
      )
      recorder_format_converter = FormatConverterOp(
          self,
          name="recorder_format_converter",
          pool=BlockMemoryPool(self, name="pool", **source_pool_kwargs),
          **self.kwargs("recorder_format_converter"),
      )
    • Since the underlying UCXTransmitter attempts to send the emitted data regardless of the status of the downstream Operator input port's message queue, simply doubling the num_blocks may not suffice in cases where the receiver operator's processing time is slower than that of the sender operator.

    • If you encounter the issue, consider using the UnboundedAllocator instead of the BlockMemoryPool to avoid the problem. The UnboundedAllocator does not have a fixed number of blocks and can allocate memory as needed, though it can cause some overhead due to the lack of a fixed memory pool size and may lead to memory exhaustion if the memory is not released in a timely manner.
      The following code snippet shows how to use the UnboundedAllocator:

      // C++
      ...
      Arg("pool") = make_resource<UnboundedAllocator>("pool");

      # Python
      from holoscan.resources import UnboundedAllocator
      ...
      pool=UnboundedAllocator(self, name="pool"),
      ...

Bug fixes

Issue Description
4381269 Fixed a bug that caused memory exhaustion when compiling the SDK in the VSCode Dev Container (using 'Tasks: Run Build Task') due to the missing CMAKE_BUILD_PARALLEL_LEVEL environment variable. Users can specify the number of jobs with the --parallel option (e.g., ./run vscode --parallel 16).
4569102 Fixed an issue where the log level was not updated from the environment variable when multiple Application classes were created during the session. Now, the log level setting in Application class allows for a reset from the environment variable if overridden.
4578099 Fixed a segfault in FormatConverterOp if used with a BlockMemoryPool with insufficient capacity to create the output tensor.
4571581 Fixed an issue where the documentation for the built-in operators was either missing or incorrectly rendered.
4591763 Application crashes if an exception is thrown from Operator::start or Operator::stop
4595680 Fixed an issue that caused the Inference operator to fail when multiple instances were composed in a single application.

Holoscan SDK v1.0.3

09 Feb 20:38

Release Artifacts

Release Notes

New Features and Improvements

Core
  • Allow operator input and output ports to have matching names
  • Application graphs with cycles are now supported
    (example)
  • Cycles in the graph are also supported in Data Flow Tracking
  • An informative error message is now raised if an unsupported condition type is provided to IOSpec::condition.
  • User-defined operators can now define parameters that are of type complex<float> or complex<double>. These parameters can either be parsed from a YAML config (e.g. using a string like "1.0 + 2.0j") or passed as a holoscan::Arg to the operator constructor.
  • Holoscan tensors containing data of type complex<float> or complex<double> can now be used.
  • Python applications can now send CuPy, NumPy or other tensor types with complex-valued data between fragments of a multi-fragment application. Previously, this only worked within a single fragment.
  • Many C++ API description methods and corresponding Python API __repr__ methods have been improved.
    • The IOSpec class now has a description method and corresponding Python __repr__ method.
    • A bug was fixed where the Arg class __repr__ could raise UnicodeDecodeError for uint8_t or int8_t argument types
    • The NetworkContext and Scheduler print more comprehensive information.
    • Python bindings for GXF conditions, resources and operators have an improved __repr__ that makes use of the underlying C++ description methods.
  • The HOLOSCAN_UCX_PORTS environment variable allows users to define preferred port numbers for the SDK's inter-node communication in a distributed application, especially in environments where specific ports need to be predetermined, such as Kubernetes.
  • A Condition or Resource class can be added to a Python operator after construction via its add_arg method.
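    A minimal sketch (MyOp is a hypothetical Python operator class):

      from holoscan.conditions import CountCondition

      op = MyOp(self, name="my_op")
      op.add_arg(CountCondition(self, 10))  # limit the operator to 10 executions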
  • Distributed applications can now leverage RDMA transports with MLNX_OFED drivers. Tested with RoCE.
  • The HOLOSCAN_HEALTH_CHECK_PORT environment variable allows users to define a port number for the SDK's health check endpoint in a distributed application.
  • A set of available keys in an Application's or Fragment's YAML configuration file can now be determined via a new config_keys() method in the C++ API or config_keys method in Python.
  • Debugging (tracing and profiling) of Python operators is now fully supported.
    • Previously, the compute, initialize, start, and stop methods of the Holoscan Operator were not compatible with Python tracing/profiling.
    • Debugging Python operator methods with the VSCode/PyCharm debugger using PyDev.Debugger (pydevd) is now feasible, as is profiling or gathering coverage data using cProfile or coverage.py.
    • For comprehensive information, refer to the Debugging section in the SDK User Guide.
Operators
  • HoloViz
    • Add views support for imgui layers
    • Support additional color formats (doc)
    • Support for multiple instances (doc)
    • Add row_pitch argument to ReadFrameBuffer function.
  • Inference
    • The ONNX Runtime (ORT) inference backend is now a plugin, like the Torch backend, allowing you to use the inference operator without requiring an installation of ORT when using other backends (like TensorRT or Torch).
Utils
  • Added a Dockerfile that contains only runtime dependencies. This Dockerfile can be built by running ./run build_run_image at the top of the repository, creating an image that is ~8.6 GB vs. the ~13 GB build container from ./run build_image. (doc)
  • The run script in the git repository received a number of updates and improvements, including:
    • Allow building as root
    • Allow running the build container without a display
    • Name build images, build directories, and install directories with the target architecture and GPU (e.g., build -> build-aarch64-dgpu)
    • Support building on systems without tty support
    • Support running on systems without xhost support
    • Added more flags (see ./run help and ./run <cmd> --help for details)
Packaging
  • Mellanox OFED user libraries were added to the NGC container to allow the use of RDMA transports from the container.
Documentation
  • The user guide source code and tooling is now released on GitHub (link)

Breaking Changes

  • H264 operator and applications were moved from the SDK to HoloHub (MR)
  • For distributed applications, there is a change to the emit/receive behavior for array-like objects (e.g. PyTorch tensor) between operators within a fragment. Previously (in v0.6.x), the array-like object type was always preserved for within-fragment emit/receive. Now, any host array-like object will be received as a NumPy array (and any device array-like object will be received as a CuPy array). Making within-fragment emit/receive behavior consistent with between-fragment emit/receive behavior was necessary to implement the fix for issue 4290043.
  • Now building against Ubuntu 22.04; Debian packages and Python wheels require GLIBC_2.35 or above.

Bug fixes

Issue Description
4185976 Cycles in a graph are not supported. As a consequence, the endoscopy tool tracking example using input from an AJA video card in the enabled overlay configuration is not functional. This is planned to be addressed in the next version of the SDK.
4196152 Getting "Unable to find component from the name ''" error message when using InferenceOp with Data Flow Tracking enabled.
4211747 Communication of GPU tensors between fragments in a distributed application can only use device 0
4212743 The Holoscan CLI packager copies unrelated files and folders located in the same folder as the model file into the App Package.
4232453 A segfault occurs if a native Python operator __init__ assigns a new attribute that overrides an existing base class attribute or method. A segfault will also occur if any exception is raised during Operator.__init__ or Application.__init__ before the parent class __init__ was called.
4206197 Distributed apps hang if multiple input/output ports are connected between two operators in different fragments.
3599303 Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags.
4187787 TensorRT backend in the Inference operator prints Unknown embedded device detected. Using 52000MiB as the allocation cap for memory on embedded devices on IGX Orin (iGPU). Addressed in TensorRT 8.6+.
4194109 AppDriver executes fragments' compose() method, which could be avoided.
4260969 App add_flow causes an issue if called more than once between a pair of operators.
4265393 Release 1.0-ea1 and 1.0-ea2 fail to run distributed applications with workers on two or more nodes.
4272363 A segfault may occur if an operator's output port containing GXF Tensor data is linked to multiple operators within the MultiThreadScheduler.
4290043 Bug in Python implicit broadcast of non-TensorMap types when at least one target operator is in a different fragment.
4293729 Python application using MultiThreadScheduler (including distributed application) may fail with GIL related error if SDK was compiled in debug mode.
4101714 Vulkan applications fail (vk::UnknownError) in containers on iGPU due to missing iGPU device node being mounted in the container. Workaround documented in run instructions.
3881725 VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running High-Speed Endoscopy gxf/cpp app on Clara AGX developer kits. Fix available in CUDA drivers 520. Workaround implemented since v0.4 to retry automatically.
4293741 Python application with more than two operators (mixed use of pure Python operator and operator wrapping C++ operator), using MultiThreadScheduler (including distributed app) and sending Python tensor can deadlock at runtime.
4313690 Failure to initialize BayerDemosaicOp in applications using the C++ API
4187826 Torch backend in the Inference operator is not supported on Tegra's integrated GPU.
4336947 The dev_id parameter of the CudaStreamPool resource is ignored.
4344061 Native Python operator overrides of the start, stop or initialize methods don't handle exceptions properly
4344408 The distributed application displays an error message if port 8777 is already in use.
4363945 Checking if a key exists in an application's config file results in an error being logged.
Fixed bad cast exception when defining optional ports enablement (buffer input, output, camera pose) for the Holoviz operator from a YAML configuration file.
Fixed invalid stride alignment of...

Holoscan SDK v0.6.0

31 Jul 20:40

Release Artifacts

Release Notes

New Features and Improvements

Core
  • Multi-fragments application support for distributed workloads (doc)
  • Async run (doc)
  • Multithread scheduler (doc1, doc2)
  • Async and Periodic conditions (doc1, doc2)
  • Realtime and Manual Clock classes (doc)
  • Optional GXF Parameter support
  • Topologically sorted Graph initialization (doc)
  • Data Flow Tracking (doc)
  • Lazy import of individual operator modules in Python
Operators
  • A V4L2 operator supporting USB and HDMI IN input video streams was added (example)
  • Improvements to the Inference (previously MultiAIInference) operator (doc), including:
    • Torch support
    • Multi backend support
    • Multi I/O support
  • Improvements to the Holoviz (visualization) operator (doc), including:
    • Dynamic text support
    • Multi-views support
    • Multi-GPU data transfer support
Utils
  • Application packager and runner (doc)
  • A new HOLOSCAN_LOG_FORMAT environment variable has been added to allow users to modify the logger message format at runtime
  • Auto loading of log verbosity environment variable (HOLOSCAN_LOG_LEVEL) and YAML config path (HOLOSCAN_CONFIG_PATH)
  • Script to decode GXF entities files (doc)
HoloHub
Documentation

Breaking API Changes

Core
  • The function holoscan::InputContext::receive has been modified to return holoscan::expected<DataT, holoscan::RuntimeError> instead of std::shared_ptr<DataT>, returning either a valid value or an error (with the type and explanation of the error). Note that IO objects are no longer all assumed to be wrapped in a std::shared_ptr.
  • Messages of type gxf::Entity between GXF-based Operators and Holoscan Native Operators have been changed to type holoscan::TensorMap in C++ and dict-type objects in Python.
Operators
  • The deprecated TensorRT inference operator was removed in favor of the Multi AI Inference operator, which was renamed to Inference operator (doc):
    • Include headers:
      • holoscan/operators/tensor_rt/tensor_rt_inference.hpp removed
      • holoscan/operators/multiai_inference/multiai_inference.hpp renamed to holoscan/operators/inference/inference.hpp
      • holoscan/operators/multiai_postprocessor/multiai_postprocessor.hpp renamed to holoscan/operators/inference_processor/inference_processor.hpp
    • C++ classes:
      • holoscan::ops::TensorRtInferenceOp removed
      • holoscan::ops::MultiAIInferenceOp renamed to holoscan::ops::InferenceOp
      • holoscan::ops::MultiAIPostprocessorOp renamed to holoscan::ops::InferenceProcessorOp
    • CMake targets:
      • holoscan::ops::tensor_rt removed
      • holoscan::ops::multiai_inference renamed to holoscan::ops::inference
      • holoscan::ops::multiai_postprocessor renamed to holoscan::ops::inference_processor
Utils
  • The function holoscan::load_env_log_level has been removed. The HOLOSCAN_LOG_LEVEL environment variable is now loaded automatically.
HoloHub
  • The class ops::VideoDecoderOp has been replaced with the classes ops::VideoDecoderRequestOp, ops::VideoDecoderResponseOp and ops::VideoDecoderContext
  • The class ops::VideoEncoderOp has been replaced with the classes ops::VideoEncoderRequestOp, ops::VideoEncoderResponseOp and ops::VideoEncoderContext

Bug fixes

Issue Description
3762996 nvidia-peermem.ko fails to load using insmod on Holoscan devkits in dGPU mode.
4048062 Warning or error when deleting TensorRT runtime ahead of deserialized engines for some versions of TensorRT
4036186 H264 encoder/decoder are not supported on iGPU

Supported Platforms

Note: This release is intended for use with the listed platforms only. NVIDIA does not provide support for this release on products other than those listed below.

  • NVIDIA IGX Orin Developer Kit: NVIDIA HoloPack 2.0 (L4T r35.3.1) or Meta Tegra Holoscan 0.6.0 (L4T r35.3.1)
  • NVIDIA Jetson AGX Orin Developer Kit: NVIDIA JetPack r35.1.1
  • NVIDIA Clara AGX Developer Kit: NVIDIA HoloPack 1.2 (L4T r34.1.2) or Meta Tegra Holoscan 0.6.0 (L4T r35.3.1)
  • x86_64 platforms with Ampere GPU or above (tested with RTX6000 and A6000): Ubuntu 20.04

Known Issues

This section supplies details about issues discovered during development and QA but not resolved in this release.

Issue Description
3878494 Inference fails after tensorrt engine file is first created using BlockMemoryPool. Fix available in TensorRT 8.4.1. Use UnboundedAllocator as a workaround.
3599303 Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags.
3881725 VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running High-Speed Endoscopy gxf/cpp app on Clara AGX developer kits. Fix available in CUDA drivers 520. Workaround implemented since v0.4 to retry automatically.
4047688 H264 applications are missing dependencies (nvidia-l4t-multimedia-utils) to run in the arm64 dGPU container
4062979 When Operators connected in a Directed Acyclic Graph (DAG) are executed by a multithreaded scheduler, their execution order in the graph is not guaranteed.
4068454 Crash on systems with NVIDIA and non-NVIDIA GPUs. Workaround documented in Troubleshooting section of the GitHub README.
4101714 Vulkan applications fail (vk::UnknownError) in containers on iGPU due to missing iGPU device node being mounted in the container. Workaround documented in run instructions.
4171337 AJA with RDMA is not working on integrated GPU (IGX or AGX Orin) due to conflicts between the nvidia-p2p and nvidia driver symbols (nvidia_p2p_dma_map_pages). Fixed in JetPack 5.1.2, expected in HoloPack 2.1
4185260 H264 application process hangs after X11 video exit.
4185976 Cycles in a graph are not supported. As a consequence, the endoscopy tool tracking example using input from an AJA video card in the enabled overlay configuration is not functional. This is planned to be addressed in the next version of the SDK.
4187826 Torch backend in the Inference operator is not supported on Tegra's integrated GPU.
4187787 TensorRT backend in the Inference operator prints Unknown embedded device detected. Using 52000MiB as the allocation cap for memory on embedded devices on IGX Orin (iGPU). Addressed in TensorRT 8.6+.
4190019 Holoviz segfaults on multi-gpu setup when specifying device using the --gpus flag with docker run. Current workaround is to use CUDA_VISIBLE_DEVICES in the container instead.
4196152 Getting "Unable to find component from the name ''" error message when using InferenceOp with Data Flow Tracking enabled.
4199282 H264 applications may fail on x86_64 du...

Holoscan SDK v0.5.1

08 Jun 19:03

Release Artifacts

Release Notes

Additions

This release of the Holoscan SDK provides the following additions:

Support for the NVIDIA IGX Orin Developer Kit

The Holoscan SDK 0.5.1 adds support for the NVIDIA IGX Orin Developer Kit in both iGPU and dGPU modes. That support is enabled by the release of the HoloPack 2.0 Developer Preview, now available through the latest version of the SDK Manager. Python wheels and Debian packages for the arm64/aarch64 architecture support both iGPU and dGPU. Starting with 0.5.1, the Holoscan container on NGC offers two separate tags, one for iGPU and one for dGPU.

Lazy loading of Python modules

Python users might want to use the Holoscan SDK without importing every operator, since doing so requires having all operator dependencies available in the Python environment (e.g., TensorRT). With 0.5.1, a user can create a Holoscan application by importing only the modules they require, for example, holoscan.core alone and not holoscan.operators.

Note: in 0.6.0, the operators will be broken down into separate modules to offer further import granularity and lower dependency requirements.

ffmpeg support in the Holoscan container on NGC

Facilitates conversion of standard video formats to GXF entities in the Holoscan containers (example)

Improved L4T Compute Assist container

The L4T Compute Assist container - used to run compute workloads on the iGPU of a developer kit configured for dGPU - now includes the deviceQuery executable to facilitate validating its configuration, along with updated troubleshooting steps.

Issues Fixed

Issue Description
- Fixed instructions in documentation for datasets download and debian package installation

Supported Platforms

Note: This release is intended for use with the listed platforms only. NVIDIA does not provide support for this release on products other than those listed below.

  • NVIDIA Clara AGX Developer Kit: NVIDIA HoloPack 1.2 (L4T r34.1.2) or Meta Tegra Holoscan 0.5.0 (L4T r35.2.1)
  • NVIDIA IGX Orin [ES] Developer Kit: NVIDIA HoloPack 1.2 (L4T r34.1.2) or Meta Tegra Holoscan 0.5.0 (L4T r35.2.1)
  • NVIDIA IGX Orin Developer Kit: NVIDIA HoloPack 2.0 (L4T r35.4.0) or Meta Tegra Holoscan 0.5.1 (L4T r35.3.1)
  • x86_64 platforms with Ampere GPU or above (tested with RTX6000 and A6000): Ubuntu 20.04

Known Issues

This section supplies details about issues discovered during development and QA but not resolved in this release.

| Issue | Description |
| ----- | ----------- |
| 3878494 | Inference fails after the TensorRT engine file is first created when using BlockMemoryPool. Fix available in TensorRT 8.4.1. Use UnboundedAllocator as a workaround (see the sketch after this table). |
| 3762996 | nvidia-peermem.ko fails to load using insmod on Holoscan devkits in dGPU mode. Install nvidia-peer-memory following the RDMA instructions in the Holoscan SDK User Guide. |
| 3655489 | Installing dGPU drivers can remove the nvgpuswitch.py script from the executable search path. Explicitly including /opt/nvidia/l4t-gputools/bin in the PATH environment variable ensures this script can be found for execution. |
| 3599303 | Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags. |
| 3633688 | RDMA on the NVIDIA IGX Orin [ES] Developer Kit (holoscan-devkit) is not functional. A PCIe switch firmware update fixed the issue. RDMA for the Clara AGX Developer Kit is functional and unaffected by this issue. |
| 3881725 | VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running the High-Speed Endoscopy gxf/cpp app. Fix available in CUDA drivers 520. Workaround implemented in v0.4 to retry automatically. |
| 4048062 | Warning or error when deleting the TensorRT runtime ahead of deserialized engines for some versions of TensorRT |
| 4036186 | H264 encoder/decoder are not supported on iGPU |
| 4047688 | H264 applications are not able to run in the arm64 dGPU container |
| 4101714 | --privileged permission required to run rendering applications from the Holoscan iGPU container on the IGX Orin Developer Kit with Holopack 2.0 DP |
| 4116861 | H264 video encoding fails on the IGX Orin Developer Kit with Holopack 2.0 DP |

Holoscan SDK v0.5.0

03 Apr 14:26

Release Artifacts

Release Notes

New Features

This release of the Holoscan SDK, along with additions to HoloHub, provides the following main features:

H264 encoder/decoder support

Operators supporting accelerated H264 bitstream encoding and decoding were added to HoloHub, as illustrated by two new applications: h264_video_decode and h264_endoscopy_tool_tracking.

iGPU compute support on Holoscan Developer Kits

The L4T Compute Assist container is now available on NGC to perform computation on the integrated GPU (iGPU) of Holoscan Developer Kits configured to use their discrete GPU (dGPU), allowing workloads to run on both GPUs in parallel.

Use Holoscan operators in GXF applications

Infrastructure and documentation were added to wrap Holoscan operators as GXF codelets so they can be used by other frameworks which use GXF extensions.

x86_64 physical I/O support

The Holoscan SDK now officially supports physical I/O on x86_64 platforms. The High-Speed Endoscopy application on HoloHub has been tested with Rivermax/GPUDirect RDMA support and offers performance similar to that previously reported on the Holoscan Developer Kits.

Depth-map rendering

The Holoscan SDK visualization module (referred to as Holoviz) adds depth-map rendering capabilities to support displaying inference results with depth information.

New examples

The Holoscan SDK now provides a new suite of examples with associated step-by-step documentation to better introduce users to the SDK, taking them from a Hello World example to an application that deploys an ultrasound segmentation inference. Additional examples are also available to demonstrate how to integrate sensors and third-party frameworks into a workflow.

Changes from previous release

- All sample applications, along with domain-specific operators, were migrated from the Holoscan SDK to HoloHub.
- Most operators have been transitioned to native implementations.

Issues Fixed

| Issue | Description |
| ----- | ----------- |
| 3834424 | Ultrasound segmentation application is not functional on NVIDIA IGX Orin [ES] Developer Kit with iGPU configuration in deployment stack |
| 3842899 | High-Speed Endoscopy application is not supported in deployment stack |
| 3897810 | Applications not working on x86_64 systems with multiple GPUs |
| 3936290 | Cannot run exclusive display from docker container |

Supported Platforms

Note: This release is intended for use with the listed platforms only. NVIDIA does not provide support for this release on products other than those listed below.

| Platform | OS |
| -------- | -- |
| NVIDIA Clara AGX Developer Kit | NVIDIA Holopack 1.2 (L4T r34.1.2)<br>Meta Tegra Holoscan 0.5.0 (L4T r35.2.1) |
| NVIDIA IGX Orin [ES] Developer Kit | NVIDIA Holopack 1.2 (L4T r34.1.2)<br>Meta Tegra Holoscan 0.5.0 (L4T r35.2.1) |
| NVIDIA IGX Orin Developer Kit | Meta Tegra Holoscan 0.5.0 (L4T r35.2.1) |
| x86_64 platforms with Ampere GPU or above (tested with RTX6000 and A6000) | Ubuntu 20.04 |

Known Issues

This section supplies details about issues discovered during development and QA but not resolved in this release.

| Issue | Description |
| ----- | ----------- |
| 3878494 | Inference fails after TensorRT engine file is first created using BlockMemoryPool. Fix available in TensorRT 8.4.1. Use UnboundedAllocator as a workaround. |
| 3762996 | nvidia-peermem.ko fails to load using insmod on Holoscan devkits in dGPU mode. Install nvidia-peer-memory following the RDMA instructions in the Holoscan SDK User Guide. |
| 3655489 | Installing dGPU drivers can remove nvgpuswitch.py script from the executable search path. Explicitly including /opt/nvidia/l4t-gputools/bin in the PATH environment variable ensures this script can be found for execution. |
| 3599303 | Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags. |
| 3633688 | RDMA on the NVIDIA IGX Orin [ES] Developer Kit (holoscan-devkit) is not functional. PCIe switch firmware update fixed the issue. RDMA for the Clara AGX Developer Kit is functional and unaffected by this issue. |
| 3881725 | VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running High-Speed Endoscopy gxf/cpp app. Fix available in CUDA drivers 520. Workaround implemented in v0.4 to retry automatically. |
| 4048062 | Warning or error when deleting TensorRT runtime ahead of deserialized engines for some versions of TensorRT |
| 4036186 | H264 encoder/decoder are not supported on iGPU |
| 4047688 | H264 applications are missing dependencies (nvidia-l4t-multimedia-utils) to run in the arm64 dGPU container |

Holoscan SDK v0.4.1

06 Feb 21:07

Release Notes

Changes from previous release

- HoloPack: update to version 1.2
- Python: throw warnings instead of exceptions if a GXF extension cannot be loaded, to unblock execution if the operator using that extension is not needed by the application.

Issues Fixed

- NGC container: fixed an issue where an expired signing key prevented apt update from running
- Source: fixed issues in the Dockerfile related to a missing pinned dependency and an expired signing key

Please see the list of known issues below for more information.

Supported Platforms

Note: This release is intended for use with the listed platforms only. NVIDIA does not provide support for this release on products other than those listed below.

| Description | Supported Version |
| ----------- | ----------------- |
| Supported NVIDIA® Tegra® Linux Driver Package (L4T) | NVIDIA® Holopack 1.2 -- R34.1.2 |
| Supported Jetson Platforms | Holoscan Developer Kits |
| Supported x86_64 Platforms | Ubuntu 20.04 with Ampere GPU or above (tested with RTX6000 and A6000) |
| Supported Software for Clara AGX Developer Kit with NVIDIA® RTX6000 and IGX Orin Developer Kit with NVIDIA® A6000 | NVIDIA® Driver 510.73.08<br>CUDA 11.6.1<br>TensorRT 8.2.3<br>GXF 2.5<br>AJA NTV2 SDK 16.2 |

Known Issues

This section supplies details about issues discovered during development and QA but not resolved in this release.

| Issue | Description |
| ----- | ----------- |
| 3878494 | Inference fails after TensorRT engine file is first created using BlockMemoryPool |
| 3762996 | nvidia-peermem.ko fails to load using insmod on Holoscan devkits in dGPU mode. Install nvidia-peer-memory following the RDMA instructions in the Holoscan SDK User Guide. |
| 3655489 | Installing dGPU drivers can remove nvgpuswitch.py script from the executable search path. Explicitly including /opt/nvidia/l4t-gputools/bin in the PATH environment variable ensures this script can be found for execution. |
| 3599303 | Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags. |
| 3633688 | RDMA on the NVIDIA IGX Orin Developer Kit (holoscan-devkit) is not functional. PCIe switch firmware update fixed the issue. RDMA for the Clara AGX Developer Kit is functional and unaffected by this issue. |
| 3834424 | Ultrasound segmentation application is not functional on NVIDIA IGX Orin Developer Kit (holoscan-devkit) with iGPU configuration in deployment stack |
| 3842899 | High-Speed Endoscopy application is not supported in deployment stack |
| 3881725 | VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running High-Speed Endoscopy gxf/cpp app (workaround implemented in v0.4; fix available in 520 drivers) |
| 3897810 | Applications not working on x86_64 systems with multiple GPUs |
| 3936290 | Cannot run exclusive display from docker container |