Holoscan SDK v2.1.0
Release Artifacts
- 🐋 Docker container: tags `v2.1.0-dgpu` and `v2.1.0-igpu`
- 🐍 Python wheel: `pip install holoscan==2.1.0`
- 📦️ Debian packages: `2.1.0.1-1`
- 📖 Documentation
See supported platforms for compatibility.
Release Notes
New Features and Improvements
Core
- A report with execution time statistics for individual operators can now be enabled. This report contains information such as the median, 90th percentile, and maximum execution times for each operator. Setting the environment variable `HOLOSCAN_ENABLE_GXF_JOB_STATISTICS=true` enables this report (it is disabled by default, as statistics collection may introduce a minor performance overhead). For more details, see the [documentation on the feature](https://docs.nvidia.com/holoscan/sdk-user-guide/gxf_job_statistics.html).
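The metrics in this report can be reproduced conceptually: per-operator execution times are collected and summarized by median, 90th percentile, and maximum. A minimal sketch in plain Python (the function name and nearest-rank percentile method here are illustrative, not part of the Holoscan API):

```python
import math
from statistics import median

def summarize_execution_times(times_ms):
    """Summarize per-operator execution times with the same metrics the
    GXF job statistics report uses: median, 90th percentile, and maximum."""
    if not times_ms:
        raise ValueError("no samples collected")
    ordered = sorted(times_ms)
    # Nearest-rank 90th percentile: smallest sample such that at least
    # 90% of all samples are less than or equal to it.
    rank = math.ceil(0.9 * len(ordered))
    return {
        "median": median(ordered),
        "p90": ordered[rank - 1],
        "max": ordered[-1],
    }
```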
- The `holoscan.Tensor` object's `data` property in the Python API now returns an integer (pointer address) instead of a NULL PyCapsule object, potentially avoiding confusion about data availability. Users can confirm the presence of data via the `__array_interface__` or `__cuda_array_interface__` properties. This change allows for direct access to the data pointer, facilitating debugging and performance optimization.
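The same integer-pointer convention can be seen with NumPy arrays, which implement the equivalent `__array_interface__` protocol; this sketch illustrates the interface convention only, not Holoscan itself:

```python
import numpy as np

# Any object implementing __array_interface__ exposes its data pointer as a
# plain integer, which is the convention holoscan.Tensor.data now follows.
arr = np.arange(8, dtype=np.int32)
iface = arr.__array_interface__
ptr, read_only = iface["data"]
print(hex(ptr), iface["typestr"], iface["shape"])
```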
- The string representation of the `IOSpec` object, generated by `IOSpec::to_yaml_node()`, includes `ConditionType` information in the `type` field. It correctly displays `kNone` when no condition (`ConditionType::kNone` in C++ and `ConditionType.NONE` in Python) is explicitly set:

  ```yaml
  name: receiver
  io_type: kInput
  typeinfo_name: N8holoscan3gxf6EntityE
  connector_type: kDefault
  conditions:
    - type: kNone
  ```
- Enhanced the macros (`HOLOSCAN_CONDITION_FORWARD_TEMPLATE`, `HOLOSCAN_RESOURCE_FORWARD_TEMPLATE`, `HOLOSCAN_OPERATOR_FORWARD_TEMPLATE`, etc.) by using the full namespace of the classes, improving their robustness and adaptability across different namespace scopes.
- Updated the `holoscan.core.py_object_to_arg()` method to allow conversion of Python objects to `Arg` objects using `YAML::Node`. This resolves type mismatches, such as when the underlying C++ parameter expects `int32_t` but Python uses `int64_t`.
- The Python `OperatorSpec`/`ComponentSpec` class exposes `inputs` and `outputs` properties, providing direct access to the input and output IO specs. This enhancement simplifies the process of setting conditions on inputs and outputs.

  ```python
  def setup(self, spec: OperatorSpec):
      spec.input("data")
      # Set the NONE condition on the input port named `data`.
      spec.inputs["data"].condition(ConditionType.NONE)
      print(spec.inputs["data"])
  ```
- Workflows where an operator connects to multiple downstream operators within the same fragment may see a minor performance boost, due to an internal refactoring of how connections between operators are made. Previously, a GXF broadcast codelet was automatically inserted into the graph behind the scenes to broadcast the output to multiple receivers. As of this release, a direct 1:N connection is made from the output port, without the framework needing to insert this extra codelet.
- `fmt::format` support for printing the `Parameter` class has been added (there is no longer a need to call the `get()` method to print the contained value). This allows parameter values to be directly printed in `HOLOSCAN_LOG_*` statements. For example:

  ```cpp
  MetaParameter p = MetaParameter<int>(5);
  HOLOSCAN_LOG_INFO("Formatted parameter value: {}", p);
  // can also pass parameter to fmt::format
  std::string format_message = fmt::format("{}", p);
  ```
Operators/Resources
- Most built-in operators now perform additional validation of input tensors and raise more helpful messages if the dimensions, data type, or memory layout of the provided tensors is not as expected. The remaining operators (`InferenceOp`, `InferenceProcessorOp`) will be updated in the next release.
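The kind of validation these operators now perform can be sketched as follows (a standalone illustration using NumPy arrays; the function name and message wording are not the SDK's):

```python
import numpy as np

def validate_tensor(tensor, expected_ndim, expected_dtype):
    """Raise a descriptive error when a tensor's rank or dtype is unexpected,
    mirroring the more helpful messages the built-in operators now produce."""
    if tensor.ndim != expected_ndim:
        raise ValueError(
            f"expected a {expected_ndim}D tensor, got {tensor.ndim}D "
            f"with shape {tensor.shape}")
    if tensor.dtype != np.dtype(expected_dtype):
        raise ValueError(
            f"expected dtype {np.dtype(expected_dtype)}, got {tensor.dtype}")
```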
- `BayerDemosaicOp` and `FormatConverterOp` now automatically perform host-to-device copies when needed, for either `nvidia::gxf::VideoBuffer` or `Tensor` inputs. Previously, these operators only did the transfer automatically for `nvidia::gxf::VideoBuffer`, not for `Tensor`, and in the case of `FormatConverterOp` the transfer was only done automatically for pinned host memory. As of this release, both operators copy only unpinned system memory, leaving pinned host memory as-is.
- When creating Python bindings for C++ operators, it is now possible to register custom type conversion functions for user-defined C++ types. These handle conversion to and from a corresponding Python type. See the newly expanded section on creating Python bindings for C++ operators for details.
- As of this release, all provided Python operators support passing conditions such as `CountCondition` or `PeriodicCondition` as positional arguments. In previous releases, there was a limitation that Python operators that wrapped an underlying C++ operator did not support this. As a concrete example, one could now pass a `CountCondition` to limit the number of frames the visualization operator will run for:

  ```python
  holoviz = HolovizOp(
      self,
      # add count condition to stop the application after a short duration (i.e. for testing)
      CountCondition(self, count),
      name="holoviz",
      **self.kwargs("holoviz"),
  )
  ```
- The AJA NTV2 dependency, and the corresponding AJA Source Operator, have been updated to use the latest official AJA NTV2 17.0.1 release. This new NTV2 version also introduces support for the KONA XM hardware.
- The Holoviz operator now supports setting the camera for layers rendered in 3D (the geometry layer with 3D primitives and the depth map layer). The camera eye, look-at, and up vectors can be initialized using parameters or changed dynamically at runtime by providing data at the respective input channels. More information can be found in the documentation. There is also a new C++ example, `holoviz_camera.cpp`.
- The Holoviz operator now supports different types of camera pose outputs. In addition to the 4x4 row-major projection matrix, a camera extrinsics model of type `nvidia::gxf::Pose3D` can now also be output. The output type is selected by setting the `camera_pose_output_type` parameter.
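The relationship between the two output representations can be illustrated by splitting a 4x4 row-major rigid camera transform into the rotation and translation that a `Pose3D`-style extrinsics model carries (a conceptual NumPy sketch, not SDK code):

```python
import numpy as np

def matrix_to_pose(m):
    """Split a 4x4 row-major rigid-transform matrix into the 3x3 rotation
    and 3-vector translation of a Pose3D-style extrinsics model."""
    m = np.asarray(m, dtype=np.float64).reshape(4, 4)
    rotation = m[:3, :3]
    translation = m[:3, 3]
    return rotation, translation

# Example: a pure translation by (1, 2, 3), i.e. identity rotation.
pose = np.eye(4)
pose[:3, 3] = [1.0, 2.0, 3.0]
rotation, translation = matrix_to_pose(pose)
```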
- The Holoviz operator now supports Wayland. The `run launch` command has also been updated to support Wayland.
- The inference operator (`InferenceOp`) now supports a new optional parameter, `temporal_map`, which can be used to specify a frame interval at which inference will be run. For example, setting a value of 10 for a given model will result in inference only being run on every 10th frame. Intermediate frames will output the result from the most recent frame at which inference was run. The interval value is specified per model, allowing different inference models to be run at different rates.
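The frame-interval semantics of `temporal_map` can be sketched in plain Python: inference runs only on every Nth frame, and intermediate frames reuse the most recent result (an illustration of the behavior, not the operator's implementation):

```python
class TemporalMapRunner:
    """Run an inference callable only every `interval` frames, reusing the
    most recent result for the frames in between."""

    def __init__(self, infer, interval):
        self.infer = infer
        self.interval = interval
        self.last_result = None

    def process(self, frame_index, frame):
        # Frames 0, interval, 2*interval, ... trigger real inference;
        # all other frames return the cached result.
        if frame_index % self.interval == 0 or self.last_result is None:
            self.last_result = self.infer(frame)
        return self.last_result
```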
- The existing asynchronous scheduling condition is now also available from Python (via `holoscan.conditions.AsynchronousCondition`). For an example of usage, see the new asynchronous ping example.
- We introduce the `GXFCodeletOp` and `GXFComponentResource` classes, streamlining the import of GXF codelets and components into Holoscan applications. These additions simplify the setup process, allowing users to utilize custom GXF components more intuitively and efficiently.

  ```cpp
  auto tx = make_operator<ops::GXFCodeletOp>(
      "tx",
      "nvidia::gxf::test::SendTensor",
      make_condition<CountCondition>(15),
      Arg("pool") = make_resource<GXFComponentResource>(
          "pool",
          "nvidia::gxf::BlockMemoryPool",
          Arg("storage_type") = static_cast<int32_t>(1),
          Arg("block_size") = 1024UL,
          Arg("num_blocks") = 2UL));
  ```

  ```python
  tx = GXFCodeletOp(
      self,
      "nvidia::gxf::test::SendTensor",
      CountCondition(self, 15),
      name="tx",
      pool=GXFComponentResource(
          self,
          "nvidia::gxf::BlockMemoryPool",
          name="pool",
          storage_type=1,
          block_size=1024,
          num_blocks=2,
      ),
  )
  ```

  Please check out the examples in the `examples/import_gxf_components` directory for more information on how to use these new classes.
- When calling `op_output.emit` from the `compute` method of a Python operator, it is now possible to provide a third argument that overrides the default choice of the type of object emitted. This is sometimes needed to emit a certain C++ type from a native Python operator when connecting it to a different Python operator that wraps an underlying C++ operator. For example, one could emit a Python string as a C++ `std::string` instead of the Python string object via `op_output.emit(py_str, "port_name", "std::string")`. See additional examples and a table of the C++ types registered by default here.
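Conceptually, the third argument selects a registered converter by C++ type name. The toy registry below is purely illustrative — the names, converters, and `emit` signature are stand-ins, not the SDK's internals, which dispatch to real C++ emitters:

```python
# Hypothetical registry mapping C++ type names to conversion callables,
# illustrating how a type-name argument can select the emitted representation.
_emitter_registry = {
    "std::string": lambda obj: str(obj).encode("utf-8"),  # stand-in conversion
    "double": float,
}

def emit(data, port_name, cpp_type=None):
    """Convert `data` with the registered converter when a C++ type name is
    given; otherwise pass the Python object through unchanged."""
    payload = _emitter_registry[cpp_type](data) if cpp_type else data
    return port_name, payload
```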
- The documentation strings of the built-in operators that support use of a `BlockMemoryPool` allocator now have detailed descriptions of the block size and number of blocks that will be needed. This information appears under the header "==Device Memory Requirements==".
Utils
- The new Data Exporter C++ API (`DataExporter` and `CsvDataExporter`) is now available. This API can be used to export Holoscan application output to CSV files for Holoscan Federated Analytics applications. `DataExporter` is a base class supporting the export of Holoscan application output in different formats; `CsvDataExporter` is a class derived from `DataExporter` that exports application output to CSV files.
- The Holoscan containers now default to `NVIDIA_DRIVER_CAPABILITIES=all`, removing the need to set it with `docker run -e ...`. That value can still be overridden with a manual `-e NVIDIA_DRIVER_CAPABILITIES=...`.
- The `run` script now checks for X11 and Wayland and passes the corresponding options to the `docker` command.
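What a CSV export of application output looks like can be sketched with the standard library alone; this mirrors the idea behind the Data Exporter API above, not the actual `CsvDataExporter` interface:

```python
import csv
import io

def export_rows_to_csv(columns, rows, stream):
    """Write application output rows under a header row, in the spirit of
    what a CsvDataExporter-style exporter produces."""
    writer = csv.writer(stream)
    writer.writerow(columns)
    writer.writerows(rows)

# Example: two frames of detection output written to an in-memory stream.
buf = io.StringIO()
export_rows_to_csv(["frame", "label", "score"],
                   [[0, "tool", 0.98], [1, "tool", 0.97]], buf)
print(buf.getvalue())
```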
HoloHub
Documentation
Breaking Changes
Bug fixes
Issue | Description |
---|---|
4616519 | Resolved an issue where standalone fragments without UCX connections were not executed. The fix ensures the internal connection map is initialized for each fragment regardless of UCX connections, enhancing reliability and execution consistency. |
4616525 | Addressed a bug where the `stop_on_deadlock` parameter of the scheduler was not being correctly set to `false` via the `HOLOSCAN_STOP_ON_DEADLOCK` environment variable. This fix ensures that the boolean value is accurately set to `false` when the environment variable is assigned false-like values. |
Known Issues
This section supplies details about issues discovered during development and QA but not resolved in this release.
Issue | Description |
---|---|
4062979 | When operators connected in a Directed Acyclic Graph (DAG) are executed by a multithreaded scheduler, their execution order in the graph is not guaranteed to be adhered to. |
4267272 | AJA drivers cannot be built with RDMA on IGX SW 1.0 DP iGPU due to a missing `nv-p2p.h`. Expected to be addressed in IGX SW 1.0 GA. |
4384768 | No RDMA support on JetPack 6.0 DP and IGX SW 1.0 DP iGPU due to a missing nv-p2p kernel module. Expected to be addressed in JetPack 6.0 GA and IGX SW 1.0 GA, respectively. |
4190019 | Holoviz segfaults on multi-GPU setups when specifying the device using the `--gpus` flag with `docker run`. The current workaround is to use `CUDA_VISIBLE_DEVICES` in the container instead. |
4210082 | The `v4l_camera` example segfaults at exit. |
4339399 | High CPU usage observed with the `video_replayer_distributed` application. While the high CPU usage associated with the GXF UCX extension has been fixed since v1.0, distributed applications using the `MultiThreadScheduler` (with the `check_recession_period_ms` parameter set to `0` by default) may still experience high CPU usage. Setting the `HOLOSCAN_CHECK_RECESSION_PERIOD_MS` environment variable to a value greater than 0 (e.g. `1.5`) can help reduce CPU usage. However, this may result in increased latency for the application until the `MultiThreadScheduler` switches to an event-based multithreaded scheduler. |
4318442 | The UCX `cuda_ipc` protocol doesn't work in Docker containers on x86_64. As a workaround, we are currently disabling the UCX `cuda_ipc` protocol on all platforms via the `UCX_TLS` environment variable. |
4325468 | The `V4L2VideoCapture` operator only supports the YUYV and AB24 source pixel formats, and only outputs the RGBA GXF video format. Other source pixel formats compatible with V4L2 can be manually defined by the user, but they are assumed to be equivalent to RGBA8888. |
4325585 | Applications using `MultiThreadScheduler` may exit early due to timeouts. This occurs when the `stop_on_deadlock_timeout` parameter is improperly set to a value equal to or less than `check_recession_period_ms`, particularly if `check_recession_period_ms` is greater than zero. |
4301203 | HDMI IN fails in `v4l2_camera` on the IGX Orin Devkit for some resolutions or formats. Try the latest firmware as a partial fix. Driver-level fixes are expected in IGX SW 1.0 GA. |
4384348 | UCX termination (via `ctrl+c`, pressing 'Esc', or clicking the close button) is not smooth and can show multiple error messages. |
4481171 | Running the driver for a distributed application on IGX Orin devkits fails when connected to other systems through eth1. A workaround is to use the eth0 port to connect to other systems for distributed workloads. |
4458192 | In scenarios where distributed applications have both the driver and workers running on the same host, either within a Docker container or directly on the host, "Address already in use" errors may occur. A potential solution is to assign a different port number via the `HOLOSCAN_HEALTH_CHECK_PORT` environment variable (default: `8777`), for example by using `export HOLOSCAN_HEALTH_CHECK_PORT=8780`. |
| Wayland: `holoscan::viz::Init()` with an existing GLFW window fails. |
4680791 | iGPU: the H264 application in the dev container takes more than 1 hour to generate the engine file. |
4680894 | The AJA driver fails to build with IGX 1.0 GA and JetPack 6.0. |
4667183 | Holoscan CLI: extraction fails with a permission error. |
4668978 | AJA: HoloHub applications with RDMA disabled crash, both in the container and with the Debian packages. |
4678092 | HoloHub: `build_and_run` of `volume_rendering_rx` fails on IGX. |
4678337 | The v4l2 sample crashes with HDMI input. |