NVIDIA AR Body Tracker

Utilise the NVIDIA Broadcast SDK on RTX GPUs to generate rigged skeleton poses based on 2D video.

Method

This node utilises the NVIDIA Broadcast SDK running on NVIDIA RTX GPUs to pick out human forms from video, then generate and track a 3D skeleton of the body. This technology may eliminate the need for physical motion capture suits in some circumstances.

It is only capable of tracking a single body at present. The tracker operates in both 2D and 3D space, but Notch makes use of the 3D data only. For best results, the camera should be as straight on as possible and the input video should be of suitably high resolution; 1080p is ideal.

This node is designed for use with the NVIDIA AR Body Tracker Skeleton, which is able to apply the tracked skeleton data from this node to a 3D motion capture rig.

This node requires the installation of the NVIDIA AR SDK, which can be downloaded here: NVIDIA AR SDK. Make sure you get the correct driver for your GPU: Turing (20XX), Ampere (30XX) and Ada (40XX) use different drivers, and they are not cross-compatible.

Real-World Alignment

The 3D skeleton is generated in a space relative to the input camera. The centre of the camera image is considered to be the origin for the 3D skeleton data. The bone positions are therefore unlikely to match the scene’s world coordinates, and a transform must be applied to the bones to move them into place. This can be done on the Skeleton node directly, or by linking a Null to the Skeleton 3D Transform input of this node. If the camera used as the video source is tracked, so that its location is known, the tracked camera can be linked to this input to bring the skeleton into world space.
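As a rough illustration of the transform described above, the sketch below moves camera-space bone positions into world space by multiplying them by the camera's world matrix. This is not Notch code: the function names, the row-major 4x4 matrix layout and the sample values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat4 = Sequence[Sequence[float]]  # assumed layout: row-major 4x4, translation in the last column


def camera_to_world(point: Vec3, camera_world: Mat4) -> Vec3:
    """Transform one camera-space point into world space (hypothetical helper)."""
    x, y, z = point
    p = (x, y, z, 1.0)  # homogeneous coordinate so the translation is applied
    out = [sum(camera_world[r][c] * p[c] for c in range(4)) for r in range(3)]
    return (out[0], out[1], out[2])


def bones_to_world(bones: List[Vec3], camera_world: Mat4) -> List[Vec3]:
    """Apply the same camera transform to every bone position."""
    return [camera_to_world(b, camera_world) for b in bones]


# Assumed example: a camera at (0, 1.5, -4) with no rotation.
camera_world = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.5],
    [0.0, 0.0, 1.0, -4.0],
    [0.0, 0.0, 0.0, 1.0],
]
bones_camera_space = [(0.0, 0.0, 3.0), (0.2, -0.5, 3.1)]
bones_world = bones_to_world(bones_camera_space, camera_world)
```

In Notch itself this multiplication is handled by the node graph once a Null or camera is linked to the Skeleton 3D Transform input; the sketch only shows what that link amounts to conceptually.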

The body tracker is not aware of changes in camera zoom, and cannot distinguish between the camera zooming in and the subject moving closer. If a tracked camera (e.g. via the Exposable Camera node) is also the video input, its zoom changes during the shot, and the 3D skeleton must match the real world, the following work-in-progress workflow is available:

Use the NVIDIA AR Body Tracker Skeleton node to visualise the 3D skeleton in space.
Link the tracked camera node to the Skeleton 3D Transform input.
Enable “Auto Adjust For FOV” on the Body Tracker node.
Set the input video camera to a known location and a known, typical FOV value. Input the FOV into the “Reference FOV” parameter.
Adjust the “Reference Distance” until the 3D skeleton matches the video.
Link the camera node’s FOV to the “Current FOV” parameter. As the camera’s FOV changes, the skeleton’s depth will adjust accordingly to maintain lineup.
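One plausible way to reason about the adjustment in the workflow above: under a pinhole camera model, a subject's size in the image is proportional to 1 / (distance * tan(FOV / 2)). If the tracker implicitly assumes the reference FOV, zooming in makes the subject appear closer than it is, and the depth can be rescaled by tan(reference FOV / 2) / tan(current FOV / 2). The sketch below illustrates that relationship only. It is an assumption, not the documented behaviour of Notch or the NVIDIA SDK, and the function name is hypothetical.

```python
import math


def compensate_depth(tracked_depth: float,
                     reference_fov_deg: float,
                     current_fov_deg: float) -> float:
    """Rescale a tracked depth for a zoom change (hypothetical helper, pinhole model)."""
    ref = math.tan(math.radians(reference_fov_deg) / 2.0)
    cur = math.tan(math.radians(current_fov_deg) / 2.0)
    # A narrower current FOV (zoomed in) gives a smaller `cur`, pushing the depth further away.
    return tracked_depth * ref / cur


# Assumed values: lined up at 60 degrees, then the camera zooms in to 30 degrees.
depth_at_reference = compensate_depth(3.0, 60.0, 60.0)
depth_when_zoomed = compensate_depth(3.0, 60.0, 30.0)
```

With the FOV unchanged the depth is left as it was; once the camera zooms in, the same tracked depth is pushed further from the camera, which is the direction of correction the workflow relies on.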

Parameters

Attributes

These properties control the core behaviours of the node.

Parameter	Details
Preview In Viewport	Preview the effect blended with alpha in the viewport.
Preview RGB In Viewport	Preview the colour values in the viewport.
Preview Alpha In Viewport	Preview the alpha values in the viewport.
Apply PostFX Before Alpha Image (Legacy)	When enabled, Post-FX applied to the node will not apply to alpha images connected to the node’s input.
Active	Enables or disables the effect. Disabling the effect means it will no longer compute, so disabling a node when not in use can improve performance.
Show Bounding Boxes	Render bounding boxes of detected bodies on screen.
Reset Body When Lost	Reset the body data when a body cannot be found.
Auto Adjust For FOV	Attempt to correct changes in depth caused by changes in input camera zoom.
Reference FOV	The initial FOV, used to line up the camera distance.
Reference Distance	The initial distance, used to line the camera up to the initial FOV. This parameter must not be animated, as changing it requires a reset of the tracker.
Current FOV	The current FOV. As this value changes, the depth will be updated.
Focal Length	The current focal length.
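The table above does not say how Focal Length relates to the FOV parameters. For reference, the standard pinhole relationship between the two is FOV = 2 * atan(sensor width / (2 * focal length)). The sketch below applies it; the 36 mm sensor width and the function names are assumptions, and Notch may use a different sensor size or convention.

```python
import math


def focal_length_to_fov_deg(focal_length_mm: float, sensor_width_mm: float = 36.0) -> float:
    """Horizontal FOV in degrees for a given focal length (assumed 36 mm sensor width)."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))


def fov_deg_to_focal_length(fov_deg: float, sensor_width_mm: float = 36.0) -> float:
    """Inverse of the above: focal length in mm for a given horizontal FOV."""
    return sensor_width_mm / (2.0 * math.tan(math.radians(fov_deg) / 2.0))


fov_50mm = focal_length_to_fov_deg(50.0)
focal_round_trip = fov_deg_to_focal_length(fov_50mm)
```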

Inputs

Name	Description	Typical Input
Skeleton 3D Transform	Applies a transform to the skeleton bones in 3D space.	Null
Effect Mask	Mask out areas where Post-FX applied to this node will not take effect.	Video Loader
Alpha Image	Use a separate video node’s luminance values to overwrite the alpha channel of the image.	Video Loader