We generate rays depending on how many samples we need to render. If only a single sample is needed, we compute the radiance of a ray through the center of the pixel and return it. Otherwise, we sample positions uniformly at random within the pixel, once for each of the `num_samples` rays. The resulting radiances are summed and averaged.
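As a rough sketch of that sampling loop (not the project's actual code: `estimate_pixel` and the `RadianceFn` callback are stand-ins for the real renderer hooks):

```cpp
#include <cstddef>
#include <functional>
#include <random>

struct Vector3 { double x = 0, y = 0, z = 0; };

// Stand-in for the renderer's "generate a ray through normalized image
// coordinates (u, v) and shade it" step.
using RadianceFn = std::function<Vector3(double, double)>;

// Average num_samples radiance estimates for pixel (px, py) of a w x h image.
Vector3 estimate_pixel(std::size_t px, std::size_t py,
                       std::size_t w, std::size_t h, std::size_t num_samples,
                       const RadianceFn& radiance, std::mt19937& rng) {
    std::uniform_real_distribution<double> jitter(0.0, 1.0);
    Vector3 sum;
    for (std::size_t i = 0; i < num_samples; ++i) {
        // A single sample goes through the pixel center; otherwise the
        // position is jittered uniformly inside the pixel's unit square.
        double dx = (num_samples == 1) ? 0.5 : jitter(rng);
        double dy = (num_samples == 1) ? 0.5 : jitter(rng);
        Vector3 L = radiance((px + dx) / w, (py + dy) / h);
        sum.x += L.x; sum.y += L.y; sum.z += L.z;
    }
    sum.x /= num_samples; sum.y /= num_samples; sum.z /= num_samples;
    return sum;
}
```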
The rays themselves are generated by translating the position vectors we receive from the earlier step into camera space, then back into world space. They originate from the camera position, and the range of valid intersection distances is bounded by the clipping values, with `fClip` governing how 'far' the ray travels.
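A minimal sketch of this ray generation (not the project's code; I've assumed a matching near-clip value `nClip`, and `right`/`up`/`back` are the camera's axes in world space, i.e. the columns of a camera-to-world rotation):

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

struct Ray {
    Vec3 o, d;              // origin and normalized direction
    double min_t, max_t;    // valid intersection range
};

// (u, v) are normalized image coordinates in [0, 1]; hFov/vFov are
// field-of-view angles in radians.
Ray generate_ray(double u, double v, double hFov, double vFov,
                 Vec3 cam_pos, Vec3 right, Vec3 up, Vec3 back,
                 double nClip, double fClip) {
    // Map (u, v) onto the virtual sensor plane at z = -1 in camera space.
    double cx = (2.0 * u - 1.0) * std::tan(0.5 * hFov);
    double cy = (2.0 * v - 1.0) * std::tan(0.5 * vFov);
    // Rotate the camera-space direction (cx, cy, -1) into world space.
    Vec3 d = {
        right.x * cx + up.x * cy - back.x,
        right.y * cx + up.y * cy - back.y,
        right.z * cx + up.z * cy - back.z,
    };
    double len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    d = {d.x / len, d.y / len, d.z / len};
    // The clipping distances bound how near/far a hit may occur.
    return Ray{cam_pos, d, nClip, fClip};
}
```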
The triangle intersection test uses the Möller-Trumbore algorithm. From the ray origin and direction and the triangle's vertices, we compute the barycentric coordinates of the candidate hit point along with the ray parameter t. If the barycentric coordinates lie within the triangle (each between 0 and 1, and summing to at most 1) and the t value is in the valid range (no greater than max_t), then the ray intersected; otherwise, it didn't.
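The test can be sketched as follows (a self-contained version, not the project's code, but the same Möller-Trumbore computation):

```cpp
#include <cmath>

struct V3 { double x, y, z; };

static V3 sub(V3 a, V3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static V3 cross(V3 a, V3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static double dot(V3 a, V3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Möller-Trumbore ray/triangle test. Writes the hit distance to *t_out and
// returns true when the barycentric coordinates lie inside the triangle and
// t falls within [min_t, max_t].
bool moller_trumbore(V3 o, V3 d, V3 p0, V3 p1, V3 p2,
                     double min_t, double max_t, double* t_out) {
    const double EPS = 1e-9;
    V3 e1 = sub(p1, p0), e2 = sub(p2, p0);
    V3 pvec = cross(d, e2);
    double det = dot(e1, pvec);
    if (std::fabs(det) < EPS) return false;   // ray parallel to triangle
    double inv_det = 1.0 / det;
    V3 tvec = sub(o, p0);
    double b1 = dot(tvec, pvec) * inv_det;    // first barycentric coordinate
    if (b1 < 0.0 || b1 > 1.0) return false;
    V3 qvec = cross(tvec, e1);
    double b2 = dot(d, qvec) * inv_det;       // second barycentric coordinate
    if (b2 < 0.0 || b1 + b2 > 1.0) return false;
    double t = dot(e2, qvec) * inv_det;       // distance along the ray
    if (t < min_t || t > max_t) return false;
    *t_out = t;
    return true;
}
```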
```
[PathTracer] Rendering... 100%! (18.4198s)
[PathTracer] BVH traced 447589 rays.
[PathTracer] Averaged 840.995940 intersection tests per ray.
```

```
[PathTracer] Rendering... 100%! (0.0488s)
[PathTracer] BVH traced 322054 rays.
[PathTracer] Averaged 2.476305 intersection tests per ray.
```
To construct the BVH, we follow a three-step process. On entering the construction function, we first loop through the primitives and expand the node's bounding box to enclose all of them. Then we check whether this node is a leaf: if the number of primitives is at most `max_leaf_size`, we simply heap-allocate the vector of primitives and return the node. Otherwise, we need to split the list into left and right children. We decide how to split by finding the longest axis of the bounding box (a simple loop over the 3 axes), then computing the average of the primitives' centroids along that axis (another simple loop over the primitives). We then loop over the primitives once more: those with centroid values less than the average are assigned to the left child, and the rest to the right. Each half is recursively `construct_bvh`'d, the results are assigned to `node->l` and `node->r`, and we return the node.
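The build can be sketched like this (a simplified stand-alone version, not the project's code: `Prim` here is just a bounding box with a centroid, and I've added a guard for the degenerate case where every centroid lands on the same side of the mean):

```cpp
#include <cstddef>
#include <vector>

// Simplified primitive: an axis-aligned bound with a centroid.
struct Prim {
    double lo[3], hi[3];
    double centroid(int a) const { return 0.5 * (lo[a] + hi[a]); }
};

struct BVHNode {
    double lo[3], hi[3];
    std::vector<Prim> prims;           // non-empty only at leaves
    BVHNode *l = nullptr, *r = nullptr;
};

BVHNode* construct_bvh(const std::vector<Prim>& prims, std::size_t max_leaf_size) {
    BVHNode* node = new BVHNode;
    // Step 1: expand the node's bound over all primitives.
    for (int a = 0; a < 3; ++a) { node->lo[a] = 1e300; node->hi[a] = -1e300; }
    for (const Prim& p : prims)
        for (int a = 0; a < 3; ++a) {
            if (p.lo[a] < node->lo[a]) node->lo[a] = p.lo[a];
            if (p.hi[a] > node->hi[a]) node->hi[a] = p.hi[a];
        }
    // Step 2: small enough to be a leaf?
    if (prims.size() <= max_leaf_size) { node->prims = prims; return node; }

    // Step 3: split on the longest axis at the mean centroid.
    int axis = 0;
    for (int a = 1; a < 3; ++a)
        if (node->hi[a] - node->lo[a] > node->hi[axis] - node->lo[axis]) axis = a;
    double mean = 0;
    for (const Prim& p : prims) mean += p.centroid(axis);
    mean /= prims.size();

    std::vector<Prim> left, right;
    for (const Prim& p : prims)
        (p.centroid(axis) < mean ? left : right).push_back(p);
    // Guard: if all centroids coincide, fall back to a leaf.
    if (left.empty() || right.empty()) { node->prims = prims; return node; }

    node->l = construct_bvh(left, max_leaf_size);
    node->r = construct_bvh(right, max_leaf_size);
    return node;
}
```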
For the intersection algorithm, we first check whether the ray intersects the node's bounding box at all; if it doesn't, or it does but the t value is out of range, we return false. If it does intersect and the node is a leaf, the boolean variant checks whether the ray hits any of the primitives and short-circuits to true on the first hit, while the Intersection-filling variant tests all the primitives and records the hit nearest the ray origin. If the node is not a leaf, we simply recurse into the two children and see if/where the ray hits among them.
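The boolean (shadow-ray) variant can be sketched as follows, with leaf primitives reduced to bounding boxes as a stand-in for the real triangle test:

```cpp
#include <algorithm>
#include <vector>

struct BRay { double o[3], d[3]; double min_t, max_t; };
struct Box { double lo[3], hi[3]; };

// Standard slab test: does the ray cross the box within [min_t, max_t]?
bool hit_box(const Box& b, const BRay& r) {
    double t0 = r.min_t, t1 = r.max_t;
    for (int a = 0; a < 3; ++a) {
        double inv = 1.0 / r.d[a];
        double near_t = (b.lo[a] - r.o[a]) * inv;
        double far_t  = (b.hi[a] - r.o[a]) * inv;
        if (inv < 0) std::swap(near_t, far_t);
        t0 = std::max(t0, near_t);
        t1 = std::min(t1, far_t);
        if (t0 > t1) return false;      // slabs don't overlap: miss
    }
    return true;
}

struct Node {
    Box bound;
    std::vector<Box> prims;            // primitive bounds; non-empty at leaves
    Node *l = nullptr, *r = nullptr;
};

// Prune whole subtrees whose bound the ray misses, and short-circuit as
// soon as any leaf primitive is hit.
bool has_intersection(const Node* n, const BRay& r) {
    if (!n || !hit_box(n->bound, r)) return false;
    if (!n->l && !n->r) {
        for (const Box& p : n->prims)
            if (hit_box(p, r)) return true;
        return false;
    }
    return has_intersection(n->l, r) || has_intersection(n->r, r);
}
```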
The performance improvement is more than two orders of magnitude (18.4 s down to 0.049 s, and roughly 841 down to 2.5 intersection tests per ray in the logs above). Much of this comes from not testing the ray against every primitive in the scene, but instead recursing only into the nodes whose bounding boxes the ray can actually hit. This results in a dramatic speedup because, at each level of the tree, roughly half of the remaining search space is pruned away.
Hemisphere and importance lighting are implemented as follows:
Hemisphere lighting is implemented by taking a number of samples from a uniform hemisphere distribution and casting a ray in the direction of each sample. The radiance accumulator is updated each time one of these rays intersects an emissive surface.
Importance lighting is done similarly, but instead of sampling directions from the whole hemisphere, we sample only the light sources in the scene, casting rays from the hit point toward points sampled on the lights.
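To make the hemisphere estimator concrete, here is a stand-alone sketch (not the project's code; the `incoming` callback plays the role of casting a ray and returning the radiance arriving from that direction, and `f` is the BSDF value). For a Lambertian surface with f = albedo / pi under constant incoming radiance, the estimate converges to the albedo:

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>

const double PI = 3.14159265358979323846;

double hemisphere_estimate(double f, std::size_t n,
                           const std::function<double(double, double, double)>& incoming,
                           std::mt19937& rng) {
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    const double pdf = 1.0 / (2.0 * PI);        // uniform hemisphere pdf
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        // Uniform direction on the upper hemisphere: cos(theta) is uniform.
        double z = uni(rng);
        double r = std::sqrt(std::max(0.0, 1.0 - z * z));
        double phi = 2.0 * PI * uni(rng);
        double wx = r * std::cos(phi), wy = r * std::sin(phi), wz = z;
        // Monte Carlo term: f * L_in * cos(theta) / pdf, with cos(theta) = wz.
        sum += f * incoming(wx, wy, wz) * wz / pdf;
    }
    return sum / n;
}
```

Importance lighting replaces the uniform direction draw with samples on the lights (and the matching pdf), which is what cuts the noise so dramatically in the comparison below.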
The following two images are rendered with importance and hemisphere lighting, respectively, with the following flags:
-t 8 -s 64 -l 32 -m 6 -r 480 360
As the number of rays increases, the image gets less and less noisy, which makes sense, as we are giving ourselves more information with which to resolve the image.
| Importance lighting | Hemisphere lighting |
| --- | --- |
| 1 ray | 1 ray |
| 4 rays | 4 rays |
| 16 rays | 16 rays |
| 64 rays | 64 rays |
In the importance lighting column, we can see that as the number of rays increases, our ability to resolve soft shadows also increases. Below 16 rays, however, there just isn't enough light information for the outer shadows to be really well-formed, although they are still visible. Above 16 rays, the outline of the shadows simply becomes clearer and finer-grained.
The difference between importance and hemisphere lighting is quite apparent, however. Even at high ray counts, the hemisphere model is much noisier. Because importance sampling shoots rays specifically at the light sources, it takes fewer rays to resolve an image of quality equivalent to the hemisphere model, which picks ray directions by uniformly sampling the hemisphere above the surface point.
Indirect lighting is implemented in `at_least_one_bounce_radiance`. As its name suggests, it starts by assigning the output of `one_bounce_radiance` to the output radiance accumulator. Then there is a chance of further bounces being counted: first, the ray must have 'bounces' remaining, as defined by `max_ray_depth` and decremented on every indirect bounce; second, a 'Russian Roulette' coin flip decides the fate of the ray. Subsequent rays (computed by calling this function recursively), should they survive and intersect the scene, are weighted by the BSDF value and cosine term, added to the output accumulator, and the total is returned.
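The recursion structure can be sketched with scalar stand-ins (not the project's code: `one_bounce` replaces the `one_bounce_radiance` estimate, `throughput` replaces the BSDF * cosine / pdf weight, and `cont_p` is the Russian Roulette continuation probability):

```cpp
#include <random>

double at_least_one_bounce(int depth, int max_ray_depth, double one_bounce,
                           double throughput, double cont_p, std::mt19937& rng) {
    double L = one_bounce;                      // start from the direct term
    if (depth >= max_ray_depth) return L;       // bounce budget exhausted
    std::bernoulli_distribution coin(cont_p);   // Russian Roulette flip
    if (!coin(rng)) return L;                   // the ray 'dies' here
    // Recurse for the next bounce; dividing by the continuation probability
    // keeps the estimator unbiased.
    L += throughput *
         at_least_one_bounce(depth + 1, max_ray_depth, one_bounce,
                             throughput, cont_p, rng) / cont_p;
    return L;
}
```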
You can see that the primary shadows are all missing, and that the scene is much darker. The primary lighting is also missing from the top of the spheres.
The ceiling is dark, and all the soft shadows are missing. This is because no rays are being reflected, so no color bleeding from the walls can be seen.
m = 0
m = 1
m = 2
m = 3
m = 100
As we increase the samples per pixel, the image resolves more cleanly, though it remains noisy: with only 4 light rays, we can only do so much, even at 64 and 1024 samples per pixel. At 1024 samples, however, it still looks pretty good.
In order to implement adaptive sampling, we collect statistics on the illuminance of the pixel we are currently tracing, while we are tracing it. Once it reaches a particular confidence threshold, we end early, speeding up the rendering process. We maintain the running mean and variance of the illuminance of the samples collected so far for the pixel, and every numSampleBatch iterations, we check whether the I value (the half-width of the 95% confidence interval for the mean, assuming a normal distribution) has fallen below our threshold.
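The convergence check can be sketched like this (assuming the usual confidence-interval form; `has_converged` and `maxTolerance` are illustrative names, not necessarily the project's):

```cpp
#include <cmath>
#include <cstddef>

// s1 and s2 are running sums of each sample's illuminance x and x*x; n is
// the number of samples so far. The pixel has converged when
// I = 1.96 * sqrt(variance / n), the half-width of the 95% confidence
// interval for the mean, drops below maxTolerance * mean.
bool has_converged(double s1, double s2, std::size_t n, double maxTolerance) {
    if (n < 2) return false;
    double mean = s1 / n;
    double var = (s2 - s1 * s1 / n) / (n - 1);  // unbiased sample variance
    double I = 1.96 * std::sqrt(std::max(0.0, var) / n);
    return I <= maxTolerance * mean;
}
```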
It’s not the easiest thing to see, but from the sampling-rate image, it can be seen that the areas with the most soft shadows (the bottoms of the spheres, and the ceiling near the light) are where the most time was spent rendering. The other areas, which are well lit or receive primarily direct lighting, such as the tops of the spheres or the back wall, get short-circuited by the adaptive sampling relatively quickly.