7.16 Optimizing Renderings
3Delight is heavily optimized: only "state of the art" algorithms are used in every area, and every feature is continuously re-evaluated against the most recent techniques so that each release is a step forward in speed and memory consumption. This does not mean that no user guidance is required. On the contrary, carefully chosen options and scene layouts can speed up renderings enormously. The following sections give some important advice on the matter.
Rendering Options Tuning
Be careful to select the right options for the render: the fastest settings that do not compromise quality.
- Format (image resolution)
- Rendering time is directly proportional to image size, which is why it is very important to select the right resolution for the render: a lower resolution means less memory used and faster renders.
- PixelSamples and PixelFilter
- More pixel samples mean slower renders. "PixelSamples 6 6" is more than enough for final renders, even with motion blur. The high quality "sinc" filter requires a wider support ("PixelFilter "sinc" 4 4" or more), which is why it is generally slower.
- ShadingRate
- Test renders should specify "ShadingRate 4" or more. Final renders should use "ShadingRate 1". Smaller shading rates are rarely needed; when they are, it is usually for shaders that produce high frequencies(62).
- Bucket and Grid size
- In our experience, a bucket size of 16x16 and a grid size of 256 is a very good general setup for 3Delight. This can be improved by experimenting with a particular scene. Also, a given grid should cover approximately one bucket in raster space, so for a shading rate of 1 and a bucket size of 32x32 the grid size should be 1024.
- Horizontal or Vertical?
- By default, the renderer issues buckets row by row. This is not the optimal bucket order if the image contains objects that are thin and horizontal. Use "Option "render" "string bucketorder" "vertical"" to specify a vertical bucket issuing strategy (column by column). This trick is likely to save memory only; the impact on speed is generally negligible.
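As a rough sketch, the settings discussed above could be gathered at the top of a RIB file as follows. The resolution is an arbitrary placeholder, and the "limits" option names used for bucket and grid size are assumptions that should be checked against the options reference. The remaining values are the ones suggested in the text.

```rib
# Sketch of final-render settings (not a definitive setup).
# Placeholder resolution: use the smallest one that meets the need.
Format 1280 720 1
PixelSamples 6 6
# "sinc" needs a wide support and is generally slower than narrower filters.
PixelFilter "sinc" 4 4
# Use 4 or more for test renders.
ShadingRate 1
# Assumed option names. 16x16 buckets at ShadingRate 1 match a grid of 256.
Option "limits" "integer bucketsize[2]" [16 16]
Option "limits" "integer gridsize" [256]
# Only worthwhile when the image contains thin horizontal objects.
Option "render" "string bucketorder" "vertical"
```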
Multithreading
Use multithreading or multiprocessing when possible. Depending on the nature of the rendered scene, performance can improve linearly with the number of threads or processes used. Refer to Performance and Implementation Notes for details.
Ray Tracing
Avoid ray tracing, if possible! Ray tracing is inherently slow since it implies intensive number crunching. Most importantly, it forces 3Delight to retain geometry for the entire render, meaning higher memory usage. An exception to the above is ray traced shadows in small to medium sized scenes, which in our experience render very fast. Here is some advice concerning ray tracing:
- Use Attribute "visibility" to specify which objects are visible to traced rays. Minimizing the number of "ray-traceable" objects is beneficial for both speed and memory usage. Additionally, using simpler objects ("stand-ins") for ray tracing is a good idea; those should be made invisible to camera rays.
- Limit maximum ray depth using Option "trace" "integer maxdepth". A value of `2' is enough in most cases. Remember that increasing this limit may slow down the renderer exponentially.
- Use grouping memberships to ray trace only a subset of the scene; this helps only for large scenes.
- Use shadow maps instead of ray traced shadows. Those are faster to generate and are re-usable. Colored shadows can be rendered efficiently using deep shadow maps. See section Shadows.
- Use environment maps or reflection maps to render reflections. In most cases, the viewer won't see the difference.
- Use image based lighting instead of brute force global illumination. See section Image Based Lighting.
- If using ray traced shadows, use the `opaque' or `Os' transmission modes; this avoids costly shader interrogations.
- Shaders triggered by secondary (ray traced) rays should use less expensive code. Secondary rays can be detected using the rayinfo() shadeop. See rayinfo shadeop.
- Use true displacements and motion blur only when necessary (in the context of ray tracing).
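The stand-in and ray depth advice above might look like the following RIB fragment. This is a sketch under assumptions: the "integer camera", "integer trace" and "string transmission" parameter names of Attribute "visibility" are assumed here and should be verified against the attributes reference, and the archive file names are hypothetical.

```rib
# Limit ray depth as suggested in the text.
Option "trace" "integer maxdepth" [2]

AttributeBegin
  # Detailed object: seen by the camera, hidden from traced rays (assumed names).
  Attribute "visibility" "integer camera" [1] "integer trace" [0]
  ReadArchive "hero_detailed.rib"
AttributeEnd

AttributeBegin
  # Simplified stand-in: hidden from the camera, visible to traced rays.
  Attribute "visibility" "integer camera" [0] "integer trace" [1]
  # "opaque" transmission avoids running shaders for ray traced shadows.
  Attribute "visibility" "string transmission" ["opaque"]
  ReadArchive "hero_standin.rib"
AttributeEnd
```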
Textures
It is well known that texture access eats up a significant amount of rendering time. Textures are also known to be great resource hogs.
- Do not use oversized textures or shadow maps. Appropriate texture or shadow map dimensions depend on the final image resolution and should be adjusted by experimenting.
- Use compressed textures; it pays off in most cases. Two benefits are less disk usage and less network traffic. tdlmake has several options for texture compression, see tdlmake options.
- Use "box" filtering if possible (see texture shadeop). The "gaussian" filter gives nicer results in general (since it is a high-quality anisotropic filter), but in many cases (such as on smooth textures) the "box" filter can be used without any visible quality penalty.
- Increase texture cache memory. This is particularly important when using deep shadow maps since one tile of DSM data takes much more space than a tile from a simple texture.
- Use the network cache, as described in Network Cache. When rendering over a network, caching textures locally can save a substantial amount of time. We timed speed gains of 15% and more with heavy production scenes on very large networks (more than a hundred machines).
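For the texture cache advice above, a minimal sketch of what raising the cache size could look like in a RIB file. The option name, its unit (assumed to be kilobytes) and the value are all assumptions for illustration; the right size depends on the scene and on available memory.

```rib
# Assumed option name and unit (kilobytes); 524288 KB = 512 MB, an arbitrary value.
# Scenes with many deep shadow maps are the ones most likely to need more cache.
Option "limits" "integer texturememory" [524288]
```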
Geometry
- Use higher order surfaces. 3Delight runs much more efficiently when provided with high level geometry such as NURBS and subdivision surfaces. Feeding finely tessellated polygons to the renderer is not recommended since it is suboptimal in many ways. In addition, curved surfaces almost certainly give nicer results (no polygonal artifacts).
- Use procedural geometry whenever possible; this can have a considerable impact on rendering speed, especially in complex scenes. 3Delight loads a procedural object only if it is visible and disposes of it after use; this is beneficial for both speed and memory usage. More on procedurals in Procedural Primitives.
- Use level of detail in complex scenes. More on level of detail at Level of Detail.
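The two fragments below sketch the procedural and level of detail advice using standard RenderMan Interface calls. Archive names, bounding boxes and detail ranges are hypothetical placeholders, not recommended values.

```rib
# Delayed archive: the file is read only if its bounding box
# [xmin xmax ymin ymax zmin zmax] turns out to be visible.
AttributeBegin
  Procedural "DelayedReadArchive" ["tree.rib"] [-1 1 0 4 -1 1]
AttributeEnd

# Level of detail: two representations of the same object,
# selected by the raster area covered by the Detail bounding box.
AttributeBegin
  Detail [-1 1 0 4 -1 1]
  # Low resolution model, fading out between 50 and 100 pixels of area.
  DetailRange [0 0 50 100]
  ReadArchive "tree_low.rib"
  # High resolution model, fading in over the same range.
  DetailRange [50 100 1e38 1e38]
  ReadArchive "tree_high.rib"
AttributeEnd
```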
Shaders
The "trend" is toward sophisticated looks, complex illumination models, anti-aliased shaders and volume shaders. This is why shading can eat up a non negligible slice of the total rendering time. It is really worth the effort to carefully optimize shaders.
- Use uniform variables whenever possible. Remember that 3Delight's shaders run in SIMD, which means that an operation processes many shading samples at once. uniform variables are computed only once per grid, contrary to varying variables, which are computed once per micro polygon. Consider the following example:

    surface test( float i_angle = PI * 0.5; )
    {
        float half_angle = i_angle * 0.5;
        ... more code goes here ...
    }

There is a "hidden" overhead in this shader since i_angle is uniform and half_angle is varying(63). This means that half_angle is initialized to the same value (i_angle/2) for each micro polygon on the grid. Although the shader compiler optimizes some varying variables into uniform, declaring half_angle as uniform ("uniform float half_angle = i_angle * 0.5;") guarantees that unnecessary operations are avoided.
- Avoid using conditionals in shaders. Conditionals can stall 3Delight's SIMD pipeline, which has an impact on shader execution speed.
- Use light categories.
Evaluating light sources is a non-negligible part of total shading time. Light categories can help avoid evaluating light source shaders that are not needed in a particular context. A very good example is an atmosphere shader that uses a "ray marching" algorithm to compute a volume's translucency and color: for each sample along the marched ray, illuminance() is called to compute incoming light, which triggers the evaluation of all attached light sources. To limit shader evaluation to the subset of lights that effectively contribute to the atmosphere, one can use the following:

    /* Only evaluate light sources that are part of the "foglight" category */
    illuminance( "foglight", currP )
    {
        ... compute volume attenuation+color here ...
    }

Listing 7.16: Using light categories
Note that using message passing, as in the following example, is not the right way to reject light sources from illuminance loops.

    illuminance( currP )
    {
        uniform float isfoglight = 1.0;
        lightsource( "isfoglight", isfoglight );
        if( isfoglight == 1.0 )
        {
            ... compute volume attenuation+color here ...
        }
    }

Listing 7.17: Erroneous use of message passing
In the erroneous example, light sources are evaluated but not included in the result, leading to little or no time gain. Light categories ensure that light sources are not evaluated, saving shader execution time. It is not uncommon to have tens of light sources in a scene, which is why it is important to use light categories, especially in volume shaders.
- Compile shaders with `-O3'. The default is `-O2', which does not include some useful optimizations.
- Compile shaders into machine code with `--dso'. This pays off with very expensive shaders, such as atmosphere shaders.
- Look for expensive shaders by using the "__CPUtime" output variable. It is possible to output an image where the luminosity represents the cost of shading by using a Display statement such as:

    Display "+cputime.tif" "tiff" "__CPUtime"

Objects which are more expensive to shade will be brighter. You can use this image as a guide to which shaders might be worth simplifying.
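Returning to light categories: the illuminance( "foglight", ... ) loop only evaluates lights that belong to the "foglight" category. A minimal sketch of how lights might be tagged in the RIB file, assuming the usual RenderMan convention that the light shader declares a string __category parameter. The shader name "fogspot", the handles and the intensities are hypothetical.

```rib
# Hypothetical light shader "fogspot", assumed to declare:
#   string __category = "";
# This light joins the "foglight" category and is seen by the fog loop.
LightSource "fogspot" 1 "float intensity" [30] "string __category" ["foglight"]

# This light keeps an empty category, so illuminance( "foglight", ... ) skips it
# and its shader is not run for the atmosphere.
LightSource "fogspot" 2 "float intensity" [10]
```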
3Delight 10.0. Copyright 2000-2011 The 3Delight Team. All Rights Reserved.