Reconcile iMdl tile format with 3D Tiles spec #7311

Open
4 tasks
pmconne opened this issue Oct 31, 2024 · 2 comments

pmconne commented Oct 31, 2024

Problem

3D Tiles is an open standard developed to solve the problem of efficiently streaming massive geospatial models over the internet for rendering. It encodes the content of each tile using another open standard, glTF. Both standards have been widely and enthusiastically embraced across several industries.

Bentley Systems is no exception: we use 3D Tiles to encode and stream photogrammetry meshes, point clouds, and subsurface models. However, our engineering content - represented as iModels - is not visualized using 3D Tiles. Instead, we use a 3D Tiles-inspired format of our own, called "iMdl". Several factors motivated the invention of a non-standard format, but they generally fall into two buckets:

  • Encoding data needed to enable specific rendering features; and
  • Optimizing the size and layout of the tiles for efficient decoding and rendering.

Our users want to enable integrations of iModel content into workflows involving game engines and other third-party renderers. The iMdl format presents a barrier to those integrations, because only iTwin.js understands it. The mesh export service now supports generating tilesets from an iModel in both iMdl format and 3D Tiles format, but much of the data encoded into the former is omitted from the latter.

iMdl optimizations

The iMdl format has been heavily optimized to minimize tile size, and to encode the data in a format that can be directly uploaded to and processed by the GPU using iTwin.js' shaders. These optimizations include things like 24-bit integers, "vertex tables" for eliminating redundant vertex attributes, and a binary feature table layout with multiple levels of indirection. We may be concerned that switching from iMdl to 3D Tiles will result in dramatically larger tiles.
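To make the 24-bit integer optimization concrete, here is a minimal sketch of packing and unpacking unsigned 24-bit values. The helper names, the little-endian byte order, and the tight packing are assumptions made for illustration; this is not the documented iMdl layout. Three bytes per value instead of four saves 25% relative to 32-bit integers.

```typescript
// Hypothetical helpers. Byte order and packing are assumptions, not the iMdl spec.

/** Pack values (each must fit in 24 bits) into 3 bytes apiece, little-endian. */
function packUint24(values: Uint32Array): Uint8Array {
  const out = new Uint8Array(values.length * 3);
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    if (v > 0xffffff)
      throw new Error(`value ${v} does not fit in 24 bits`);

    out[i * 3] = v & 0xff;
    out[i * 3 + 1] = (v >>> 8) & 0xff;
    out[i * 3 + 2] = (v >>> 16) & 0xff;
  }
  return out;
}

/** Expand tightly packed little-endian 24-bit values into a Uint32Array. */
function unpackUint24(bytes: Uint8Array): Uint32Array {
  if (bytes.length % 3 !== 0)
    throw new Error("expected a multiple of 3 bytes");

  const out = new Uint32Array(bytes.length / 3);
  for (let i = 0; i < out.length; i++) {
    const o = i * 3;
    out[i] = bytes[o] | (bytes[o + 1] << 8) | (bytes[o + 2] << 16);
  }
  return out;
}
```

A decoder could run something like `unpackUint24` once per buffer before upload, which is the kind of decode-time step the recommendation below has in mind.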

These optimizations certainly make a difference in tile size. On average, given the same iModel and tile structure, an iMdl tileset is 70-80% of the size of the equivalent 3D Tiles tileset. (The discrepancy is actually larger than shown, since the iMdl tiles include edge visibility while the 3D Tiles do not.)

| iModel | 3D Tiles | iMdl | Ratio (iMdl / 3D Tiles) |
|---|---|---|---|
| 04_Plant | 19.5 MB | 13.6 MB | 70% |
| LargePlantNew | 744 MB | 543.5 MB | 73% |
| AnimatedCastle | 1.6 GB | 1.1 GB | 69% |
| refinery | 1.5 GB | 1.2 GB | 80% |

However, applying meshopt compression to the 3D Tiles reduces their sizes by a factor of roughly four to six.

| iModel | 3D Tiles, no meshopt | 3D Tiles, meshopt |
|---|---|---|
| 04_Plant | 19.5 MB | 3.2 MB |
| LargePlantNew | 744 MB | 196.6 MB |
| AnimatedCastle | 1.6 GB | 343.1 MB |
| refinery | 1.5 GB | 323.5 MB |

When gzipped, the meshopt-compressed tiles are approximately half the size of the uncompressed tiles.

Recommendation: Encode using standard 3D Tiles, and apply any optimizations in the decoder.

iModel rendering features

The following features of the iTwin.js display system depend on data encoded into iMdl tiles with no clear analog in glTF. If we switch from iMdl to 3D Tiles, we will need alternate ways to convey the required data to the decoder/renderer.

The proposal is to produce 3D Tiles meeting two criteria:

  1. When consumed by any spec-compliant decoder and renderer, present a close approximation of what the corresponding iMdl tiles would look like in a standard iTwin.js view (i.e., a shaded view with edge display disabled and materials enabled).
  2. Encode additional data via glTF extensions that can be leveraged by iTwin.js and any other decoder and renderer supporting the extension to provide enhanced visualization features. Each of these extensions should enable one discrete feature, usable independently of the other extensions.

The first criterion implies that basic rendering of the 3D Tiles must not depend upon any new extensions. So, for example, we do not propose a "24-bit integer" extension, because that is an optimization, not a feature, and it's used for crucial data like triangle indices. (It is also unlikely to gain adoption since it doesn't map directly to a native data type). But we do propose an extension to encode edge visibility, because an iModel can be visualized effectively whether or not its edges are drawn.

The extensions are proposed as the minimum required to fulfill the second criterion. We should consider generalizing them to be potentially applicable in wider contexts and/or to align more closely with existing standards (whether within the glTF ecosystem or outside it, e.g., in USD). Only a brief description of each is provided; we will flesh out and formalize their individual specs once the overall proposal is refined and agreed upon.

Polylines

Engineering models frequently contain large numbers of line strings. iMdl supports the concept of a "polyline" - a single geometric primitive consisting of any number of line strings, each of any length. Polylines can be rendered with a width of up to 32 pixels and/or a dash pattern. Since WebGL does not support drawing wide lines natively, each segment must be tessellated and rendered as a quad. Additional triangles may need to be inserted between individual line segments to round out the corners. Applying the dash pattern requires keeping track of the distance along a line string at any given point. The iMdl format encodes the tessellated polylines directly, along with their width and dash pattern.

Image

WebGL and glTF provide three types of line primitives: LINE_STRIP and LINE_LOOP represent a single contiguous line string, while LINES represents any number of line segments with no encoding of the logical connectivity between segments. It would be possible to produce a separate LINES primitive for each line string within each polyline, but this could significantly increase the number of draw calls.

Proposed solution

Encode the polylines as glTF LINES. Define an extension for primitives of this type that encodes the index of the last vertex index of each individual line string within the primitive. The decoder can use this information to tessellate the line strings as it sees fit.
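As a rough sketch of how a decoder might consume such an extension, the function below rebuilds the individual line strings from a LINES index buffer. It assumes one reading of the proposal: that the extension stores, for each line string, the position within the index buffer of that string's final index. The function and parameter names are hypothetical, and the extension itself has not been specified.

```typescript
/**
 * Rebuild line strings from a glTF LINES index buffer (two indices per segment).
 * `lastIndexPositions[k]` is assumed to be the position in `indices` of the
 * final index of the k-th line string. This layout is an assumption.
 */
function splitLineStrings(indices: Uint32Array, lastIndexPositions: Uint32Array): number[][] {
  const lineStrings: number[][] = [];
  let start = 0;
  for (let k = 0; k < lastIndexPositions.length; k++) {
    const last = lastIndexPositions[k];
    const vertices: number[] = [];
    // The first index of every segment, in order...
    for (let i = start; i < last; i += 2)
      vertices.push(indices[i]);

    // ...followed by the second index of the final segment.
    vertices.push(indices[last]);
    lineStrings.push(vertices);
    start = last + 1;
  }
  return lineStrings;
}
```

With the connectivity recovered, the decoder is free to insert corner triangles or accumulate distance along each string for dash patterns, while a renderer that ignores the extension still draws plain one-pixel segments.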

Separately, define a "line material" extension applied to a primitive of any TRIANGLE* or LINE* type (or to a material?), encoding the integer line weight and the dash pattern as an unsigned integer bitmask where each 1 in the bitmask corresponds to a lit pixel in the pattern.
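The dash-pattern bitmask could be evaluated along these lines. This is only a sketch: it assumes a 32-bit pattern that repeats every 32 pixels, with the least significant bit corresponding to the first pixel, and a non-negative distance measured in pixels along the line string. None of those details are fixed by the proposal, and `isPixelLit` is a hypothetical name.

```typescript
/**
 * Returns true if the pixel at `pixelDistance` along the line string is lit.
 * Assumptions: 32-bit pattern, bit 0 is the first pixel, pattern repeats,
 * and pixelDistance >= 0.
 */
function isPixelLit(pattern: number, pixelDistance: number): boolean {
  const bit = Math.floor(pixelDistance) % 32;
  return ((pattern >>> bit) & 1) === 1;
}
```

For example, with the pattern `0x0f0f0f0f`, pixels 0 through 3 are lit and pixels 4 through 7 are not, repeating along the length of the line.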

Point size

Infrastructure engineers use point primitives for a variety of purposes. iMdl permits a point size in pixels to be specified for each collection of points. The iTwin.js renderer draws circular points of the specified radius.

glTF currently does not provide a way to specify point size.

Proposed solution

Apply a "point size" extension to primitives of type POINTS, encoding the point radius as an integer.
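A decoder-side lookup for such an extension might look like the following sketch. The extension name `EXAMPLE_point_size` and its `radius` property are placeholders invented for illustration; the proposal does not yet name the extension or its properties. The only glTF fact relied upon is that primitive mode 0 is POINTS.

```typescript
// Placeholder name: no such extension has been specified.
const POINT_SIZE_EXTENSION = "EXAMPLE_point_size";
const GLTF_MODE_POINTS = 0;

interface GltfPrimitive {
  mode?: number; // glTF defaults to 4 (TRIANGLES) when omitted
  attributes: { [name: string]: number };
  extensions?: { [name: string]: unknown };
}

/** Returns the integer radius from the hypothetical extension, else a default. */
function getPointRadius(primitive: GltfPrimitive, defaultRadius = 1): number {
  if (primitive.mode !== GLTF_MODE_POINTS)
    return defaultRadius;

  const ext = primitive.extensions?.[POINT_SIZE_EXTENSION] as { radius?: unknown } | undefined;
  const radius = ext?.radius;
  return typeof radius === "number" && Number.isInteger(radius) && radius > 0 ? radius : defaultRadius;
}
```

Falling back to a default keeps the first criterion intact: a renderer that does not recognize the extension still draws the points, just at its own default size.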

Edge visibility

Infrastructure engineers often want to render the edges of surfaces when visualizing their designs. This is different from wiremesh, which renders the edges of every triangle; instead, only the logical edges are drawn. Logical edges fall into two categories: "hard" edges, which are always visible (e.g., the circular outline of the end of a cylinder), and "silhouettes", which are only visible when viewed from certain angles (e.g., the edges running along the length of a cylinder). A silhouette is rendered if the faces on either side are facing in opposite directions relative to the camera.

Image
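The silhouette rule described above can be sketched as a sign test on the two adjoining face normals. The names here are hypothetical, and treating an exactly edge-on face (dot product of zero) as "not a silhouette" is an arbitrary choice made for this example.

```typescript
type Vec3 = [number, number, number];

function dot(a: Vec3, b: Vec3): number {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/**
 * `toCamera` points from the edge toward the eye. The edge is a visible
 * silhouette when one adjoining face is front-facing and the other is
 * back-facing, i.e. the two dot products have opposite signs.
 */
function isSilhouetteVisible(normalA: Vec3, normalB: Vec3, toCamera: Vec3): boolean {
  return dot(normalA, toCamera) * dot(normalB, toCamera) < 0;
}
```

Under a perspective projection `toCamera` differs for every edge, so the test has to be evaluated per edge rather than once per primitive.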

Some edges may be encoded separately as polylines, if they need to be rendered wider than one pixel and/or with a dash pattern. This is particularly important in 2D views.

The CESIUM_primitive_outline glTF extension specifies hard edges only, as pairs of vertex indices describing line segments. It only applies to primitives rendered with "mode": 4 (TRIANGLES).

iMdl encodes edge visibility as a bitmask - two bits per triangle index indicating the visibility (hard, silhouette, or none) of the edge between that vertex and the next. It also serializes the (oct-encoded) normal vectors for each pair of faces adjoining each silhouette edge.
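A two-bits-per-index bitmask of this kind could be read as in the sketch below. The numeric codes (0 = none, 1 = hard, 2 = silhouette) and the bit order within each byte (lowest bits first) are assumptions chosen for illustration; the actual iMdl encoding may differ.

```typescript
// Assumed numeric codes. The real iMdl values are not stated in this proposal.
enum EdgeVisibility {
  None = 0,
  Hard = 1,
  Silhouette = 2,
}

/**
 * Read the 2-bit visibility code for the edge that starts at the given
 * position in the triangle index buffer. Four codes are packed per byte,
 * lowest-order bits first (an assumption).
 */
function getEdgeVisibility(bits: Uint8Array, indexPosition: number): number {
  const byte = bits[indexPosition >> 2];
  const shift = (indexPosition & 3) * 2;
  return (byte >> shift) & 3;
}
```

At two bits per index, a mesh with N triangle indices needs only `Math.ceil(N / 4)` bytes of visibility data, which is considerably more compact than listing each edge as a pair of vertex indices.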

Proposed solutions

We could define a new "edge visibility" extension applying to primitives of render mode TRIANGLES, with the same encoding as in iMdl. Longer term, we would probably want to generalize it to support TRIANGLE_STRIP and TRIANGLE_FAN, and permit alternate encoding of the normal vectors.

Or, we could extend CESIUM_primitive_outline to add support for silhouette edges as pairs of indices and normal vectors. This would be a more verbose - but perhaps simpler - representation.

In either case, also add a representation of polyline edges, probably encoded as two buffers: one holding the vertex indices, and another holding the index of the last vertex index of each line string.

There is apparently some formal specification related to edge visibility in USD. We should look at it.

Materials

Infrastructure models emphasize engineering precision over aesthetics. Pattern maps and normal maps are relatively uncommon depending upon the domain, and even when present, they are often toggled off for certain workflows. So the concept of a "material" in an iModel does not map directly to the glTF definition.

Every piece of geometry in an iModel has a base color and transparency. Every surface may optionally also have a simple Phong material that can override the color and/or transparency, specify specular properties, and define a pattern and/or normal map. iTwin.js renders the geometry using its base appearance or (if present) material depending on settings like the viewport's view flags.

For each vertex, iMdl encodes both the base color and the material index. Geometries that use different, non-textured materials can be batched into a single primitive; the material index specifies the index of a vertex's material within the primitive's material atlas. Geometries that use different textured materials cannot be batched.

glTF defines materials using PBR. Each primitive can be assigned only one material. The material baseColorFactor, baseColorTexture, and the primitive's color attribute are all multiplied to determine a vertex's color.
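The multiplication can be sketched as a component-wise product of RGBA values. The helper name is hypothetical; in a real renderer this happens in the shader, with the texture term sampled per fragment.

```typescript
type Rgba = [number, number, number, number];

/** Component-wise product of any number of RGBA colors (components in 0..1). */
function multiplyColors(...colors: Rgba[]): Rgba {
  const out: Rgba = [1, 1, 1, 1];
  for (const color of colors)
    for (let i = 0; i < 4; i++)
      out[i] *= color[i];

  return out;
}

// baseColorFactor * baseColorTexture sample * vertex color
const shaded = multiplyColors([1, 0.5, 0.25, 1], [0.5, 0.5, 1, 1], [1, 1, 1, 0.5]);
```

Because the terms are multiplied rather than selected between, a material color cannot replace a vertex color in standard glTF; it can only tint it. That is a key difference from the iModel notion of a material that overrides the base color.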

Proposed solution

Long-term, we want to switch to a PBR material representation in iModels. However, this would require changes to many connectors to propagate PBR materials from the source data. Moreover, even after such a switch, we would still require the ability to toggle at display time between the base appearance and the material.

Encode glTF materials providing a "close enough" PBR representation of the Phong material, at minimum preserving color and pattern/normal map.

Define a "Phong material" extension to be applied to a glTF material, providing additional properties like specular color/exponent.

If the material overrides the geometry's base color and/or transparency, the primitive's color attribute should encode the overridden colors, not the base colors. In that case, define a secondary color attribute (or texture + UVs) encoding the base colors.
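For the "close enough" PBR approximation, one commonly cited heuristic converts a Blinn-Phong specular exponent n to a microfacet roughness via alpha = sqrt(2 / (n + 2)). glTF defines alpha as the square of the material's roughness factor, so the roughness factor is the square root of alpha. The sketch below applies that heuristic. It is not part of the proposal, the function name is hypothetical, and it assumes the iModel specular exponent behaves like a Blinn-Phong exponent, which would need to be verified.

```typescript
/**
 * Approximate a glTF roughnessFactor from a specular exponent.
 * Heuristic: alpha = sqrt(2 / (n + 2)), roughnessFactor = sqrt(alpha).
 * Assumes Blinn-Phong-like behavior; not a definitive conversion.
 */
function specularExponentToRoughness(specularExponent: number): number {
  const n = Math.max(specularExponent, 0);
  const alpha = Math.sqrt(2 / (n + 2));
  return Math.min(1, Math.sqrt(alpha));
}
```

An exponent of 0 maps to a fully rough surface, and larger exponents map to progressively smoother ones. The original exponent would still be preserved in the "Phong material" extension for renderers that want the exact appearance.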

Constant-LOD texture mapping

Constant level-of-detail texture mapping is a technique that dynamically calculates texture coordinates to keep the texture image close to a certain size on the screen, thus preserving the level of detail regardless of distance from the camera. It is useful for patterns like soil or seafloor, as illustrated below. It can be applied to both pattern maps and normal maps. iMdl encodes the parameters as part of the material. glTF provides no equivalent.

Image
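The general idea can be sketched as follows. This is a generic illustration, not the iTwin.js algorithm and not the parameter set iMdl encodes: under a perspective projection the on-screen size of a texture repetition shrinks in proportion to distance, so scaling the texture coordinates by the inverse of the distance, snapped to powers of two to avoid continuous "swimming", keeps the apparent size within a factor of two. The function name and the `repetitions` parameter are hypothetical.

```typescript
/**
 * Returns a multiplier for texture coordinates such that the texture's
 * on-screen size stays within a factor of two as `distance` changes.
 * Distances are snapped down to the nearest power of two.
 */
function constantLodUvScale(distance: number, repetitions = 1): number {
  const level = Math.floor(Math.log2(Math.max(distance, 1e-6)));
  return repetitions / Math.pow(2, level);
}
```

A production implementation would typically blend between adjacent levels to hide the transition where the scale changes, which is one reason an extension would need more parameters than this sketch uses.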

Proposed solution

Define a "constant lod texture mapping" extension applicable to textureInfo and normalTextureInfo, encoding the required parameters.

Fill flags

"Fill" refers to the interiors of closed planar regions in the context of a wireframe view (including any 2d view) which by default renders only edges, no surfaces. Fill flags provide ways to customize how the fill is rendered. For example, drawing models often include text with a filled rectangular bounding box of a contrasting color for emphasis; in this case, the rectangle should draw behind the text. Or, polygons may be used to "blank out" portions of the model by always being filled by the view's current background color.

iMdl encodes the fill flags as part of the material.

Image

Proposed solution

See Z-fighting mitigation.

Z-fighting mitigation

Z-fighting is a common visual artifact in any 3d renderer when multiple geometries try to occupy the same plane. iTwin.js provides several mitigations of this effect in common engineering workflows, including preventing edges from z-fighting with their surfaces and preventing text from z-fighting with its background fill. Another common case involves "sketching" planar geometry onto 3d surfaces, with the intent that the planar geometry should draw "on top of" the non-planar geometry.

iMdl encodes an "is planar" flag for each primitive in a tile. Many planar geometries, each occupying different planes, may be batched together into a single primitive, so planarity cannot be inferred from the primitive's bounding box.

Proposed solution

Apply a "planar primitive" extension to any primitive to indicate that the primitive contains only planar geometry. For triangle primitives, it can additionally specify the fill flags.

View-independent geometry

Infrastructure models sometimes include geometry that should always be rendered facing the camera - for example, text displayed at intervals along the length of a highway indicating distance at that point. iMdl encodes this per-primitive as an optional "view-independent origin" - the point about which the geometry should rotate to face the camera. This is similar to billboarding, though billboards are often constrained to rotate only in the X and Y axes.

Below, the cogs rotate about their center points to face the camera; the dashed line connecting them does not rotate.

Image

Proposed solution

Evaluate applicability of the draft KHR_billboard glXF extension.

Alternatively, apply a simple "view-independent origin" extension to the primitive, encoding the 3d point about which it should rotate to face the camera. Eventually, we may want to enhance it to support constraining in which axes it should rotate.
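A renderer could honor a view-independent origin roughly as sketched below: each point is rotated about the origin by the inverse of the view rotation, so that once the view rotation is applied the geometry appears exactly as authored. The names are hypothetical, and the sketch assumes `worldToView` is a pure rotation stored as three rows, so that its inverse is its transpose.

```typescript
type Vec3 = [number, number, number];
type Mat3 = [Vec3, Vec3, Vec3]; // three rows

/**
 * Rotate `point` about `origin` by the transpose (inverse) of `worldToView`,
 * cancelling the view rotation so the geometry always faces the camera.
 */
function faceCamera(point: Vec3, origin: Vec3, worldToView: Mat3): Vec3 {
  const d: Vec3 = [point[0] - origin[0], point[1] - origin[1], point[2] - origin[2]];
  const r: Vec3 = [0, 0, 0];
  for (let i = 0; i < 3; i++)
    r[i] = worldToView[0][i] * d[0] + worldToView[1][i] * d[1] + worldToView[2][i] * d[2];

  return [origin[0] + r[0], origin[1] + r[1], origin[2] + r[2]];
}
```

Constraining the rotation to particular axes, as mentioned above, would amount to cancelling only part of the view rotation instead of all of it.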

Next steps

  • Obtain agreement on the overall proposal and the specific set of extensions, making adjustments in response to feedback.
  • Formally define each extension.
  • Enhance TilesetPublisher to output the new extensions.
  • Evaluate whether it would be worth enhancing iTwin.js to decode the 3D Tiles with the new extensions to prove the concept and test the implementation.
pmconne self-assigned this Oct 31, 2024

pjcozzi commented Oct 31, 2024

@pmconne great to see the open standards adoption and contribution.

I'll reach out to @rudybear to see if we can present these potential glTF extension directions to the Khronos 3D Formats WG for early input.

Meanwhile -

I would suggest that, once we have more guidance, we start new issues in the glTF repo for each proposed extension.

@lilleyse anything you would add?


pjcozzi commented Nov 2, 2024

For polylines in glTF, also adding PixarAnimationStudios/OpenUSD-proposals#42 by @nvmkuruc as a reference here to ideally promote a reasonable OpenUSD <-> glTF interop path. See the proposal README.md.
