The rendering pipeline

Real-time 3d rendering typically deals with the problem of bringing a scene, specified as a collection of 3d models, onto the two-dimensional screen such that light, haze, surface structure and similar features are added in a credible way. Thus, there are operations having to do with coordinates involved as well as operations dealing with blending colors (strangely enough, both can be cast in the form of vectors). The whole path from the models to the final pixels on the screen is called the rendering pipeline. Since different elements in the scene can have different requirements, the pipeline is configurable (by flicking certain switches) and programmable (by GLSL code). One complete set of such instructions is typically called an effect, i.e. you can associate different effects with different parts of the scene.

The preparation stage

The graphics card, or GPU, does only part of the rendering work; there are steps which need to be done in preparation, because graphics cards are rather specialized devices with many operations hard-wired (which is what makes them so fast in the first place). So the first stage of the rendering pipeline deals with setting up a description of the scene which the graphics card can understand and use.

A scene is typically described as a series of meshes which can be created with e.g. a 3d model editor, and for which OpenGL abstraction layers like OpenSceneGraph (OSG) supply format-specific loading routines.

An example of a mesh - the Space Shuttle and its external tank.

Mathematically, a mesh is a long list of vertices, i.e. point coordinates in 3d space. The mesh also contains the information on surfaces, i.e. between which vertices a triangle should be drawn (a bare vertex list doesn't tell how they are connected). Any surface has an orientation which can be described by the surface normal. Surfaces also can be painted a certain way, i.e. have color information associated with them.

Surfaces are never directly represented in GLSL, which is why all the information on their properties is attached to the vertices as so-called attributes. For instance, the color of a triangle is attached to the three vertices at the triangle's corners. The normal of the surface spanned by three vertices is attached to each of the corners. And so on.
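To get a feeling for what this looks like from the shader's point of view, here is a minimal sketch in legacy GLSL (version 1.20 style with built-in attribute names, which is what the sketches in this text assume) showing the per-vertex attributes a vertex shader can read for the vertex it is currently working on:

    #version 120

    void main()
    {
        vec4 position = gl_Vertex;          // vertex position in model coordinates
        vec3 normal   = gl_Normal;          // normal of the surface the vertex belongs to
        vec4 color    = gl_Color;           // per-vertex color
        vec4 texCoord = gl_MultiTexCoord0;  // texture mapping coordinates

        // custom attributes could be declared with the attribute keyword;
        // a vertex shader must in any case write the transformed vertex position
        gl_Position = gl_ModelViewProjectionMatrix * position;
    }

(The shader above merely reads the attributes to show their names - what is actually done with them is the topic of the vertex shader stage described below.)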

There can be several meshes in a scene (see e.g. the External Tank and the Shuttle above, which in turn is composed of different parts). They are processed separately, each in its own draw. Basically, a draw is whatever you define to be drawn together: as long as the Shuttle and the tank are connected, they could be rendered in a single draw, but once they separate their positions can no longer be specified by a single transformation, so we're likely to use two draws.

For each draw, there are parameters which do not change vertex by vertex - the light source which we want to use to illuminate the scene is a prime example. These are called uniforms and are not attached to a vertex like attributes, but to a draw as part of its state set. The state set also specifies which shader code to use for a particular draw.

Another example of a uniform is the transformation used to move a model around in the scene. Theoretically this could be done by changing all vertex positions in a mesh. However, for rigid-body rotations it's actually much easier to compute one transformation matrix for the whole mesh, attach it as a uniform and let the graphics card do the moving. This is much faster, because not only is the graphics card a massively parallel device, it also has hardware support for things like matrix multiplications.
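As a sketch of what this looks like on the GLSL side (the uniform names below are made up for illustration; in legacy GLSL, the built-in matrices are themselves uniforms supplied with the draw):

    #version 120

    // per-draw constants, set by the application as part of the state set
    uniform vec3 light_direction;   // light source used to illuminate this draw (not used in this sketch)
    uniform mat4 model_transform;   // rigid-body transformation for the whole mesh

    void main()
    {
        // every vertex of the mesh is moved by the same matrix - the graphics
        // card does this vertex by vertex, in parallel and in hardware
        vec4 moved_vertex = model_transform * gl_Vertex;
        gl_Position = gl_ModelViewProjectionMatrix * moved_vertex;
    }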

What is passed to the GPU are then arrays of vertices with associated arrays of attributes, bunched into draws with associated state sets.

Setting up a scene efficiently ('scenegraph management') and pre-structuring the data is an art in its own right, and is not covered by this introduction. What this part technically looks like depends on what tools you're working with - it might be direct OpenGL instructions in C++, it might be using an abstraction layer like OSG but still in C++, or it might be that (like in Flightgear) you have a very high-level abstraction layer available and all effects can be designed in xml. In the following, it is assumed that you have a way of setting up the renderer such that you can load meshes and create effects - if that is not the case, the OpenSceneGraph Tutorials, for instance, contain sample code which does just that for you.

The vertex shader

The first GLSL-programmable stage of the rendering pipeline is the vertex shader. As the name suggests, a vertex shader has access to a vertex and its associated attributes, in addition to the state set associated with the draw this particular vertex is part of. What the vertex shader does not have access to is non-local information - most importantly, it does not know anything about neighbouring vertices, connectivity, surfaces or anything like that.

The reason for this limitation is parallel computation - as long as the vertex shader only processes one bit of the scene at a time, the computation can be spread out across a large number of cores - which is part of what makes graphics cards fast. If the computation on one core had to query other cores for the results of their computations, it would be substantially slower.

What's expected as output from a vertex shader is at minimum another vertex, and optionally properties associated with the vertex (these use a data type called varying, for reasons which will become apparent below).

The most common usage of the vertex shader is to position objects in the scene. As indicated above, it's computationally cheaper to move an object by passing the transformation matrix as a uniform to the GPU than to change every vertex prior to sending a mesh to the graphics card. The vertex shader thus gets to see the vertex in the model coordinates used in the original 3d modeling tool, applies the transformation matrix and outputs the vertex in the intended coordinates for the scene, from where everything can be projected by another matrix into flat screen coordinates.
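Putting the pieces together, a minimal vertex shader doing just that could look as follows (again a sketch in legacy GLSL 1.20; the varying names are chosen freely):

    #version 120

    varying vec2 texCoord;      // handed on for interpolation across the triangles
    varying vec4 vertexColor;

    void main()
    {
        // pass per-vertex properties on as varyings
        texCoord    = gl_MultiTexCoord0.st;
        vertexColor = gl_Color;

        // model coordinates -> flat screen (clip) coordinates, using the
        // built-in matrix uniform attached to the draw
        gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
    }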

The rasterizer (and others)

The next processing steps also happen on the GPU, but they're not fully programmable; only for some of them can options be set. The most important task is rasterizing - the mesh of vertices and surfaces is translated into screen pixels. For this, the mesh is first projected onto a flat sheet perpendicular to the camera, then a grid (the 'raster') is applied to it, with the grid spacing determined by the screen resolution, and then properties are assigned to the individual pixels, or fragments, according to the following principles:

If the grid spacing is smaller than a triangle (i.e. a triangle is represented by multiple pixels), then the varying data types defined at the vertex stage are linearly interpolated across the triangle - a pixel close to a corner gets almost the value at that corner, a pixel in the center of the triangle gets the average of all three corner values, and so on (see the schematic formula after these two cases).

If the grid spacing is larger than a triangle, i.e. there are many vertices per pixel, the rasterizer just pulls a semi-random sample and uses that triangle to represent the pixel. Since this may lead to artifacts, it is possible to instruct the rasterizer to do multiple samples and take their average in the end (which is however computationally more expensive).
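Schematically (ignoring the perspective correction the GPU applies in addition), if the three corners of a triangle carry the varying values v_1, v_2 and v_3, the value a pixel inside the triangle receives is the weighted average

    v = \lambda_1 v_1 + \lambda_2 v_2 + \lambda_3 v_3, \qquad \lambda_1 + \lambda_2 + \lambda_3 = 1,

where the weights \lambda_i are the barycentric coordinates of the pixel within the triangle - at a corner one weight is 1 and the other two are 0, at the center all three are 1/3.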

Coordinates are also an important part of what is assigned to a pixel, in particular texture mapping coordinates (if defined).

It is possible to run various buffer-based masks after the rasterizer has done its job, for instance a depth buffer mask, an alpha mask or a stencil buffer mask - these can drop pixels before they are passed to the (possibly expensive) final stage. Such masks however apply at this stage only if the fragment shader doesn't alter the tested properties - for instance, for an alpha mask to work, the fragment shader may not change the alpha channel of a pixel, otherwise the mask has to be applied later.

The fragment shader

The fragment shader has access to all varyings interpolated across the triangles for this particular pixel, in addition to the uniforms associated with the draw the pixel is part of. In addition, it can access a special class of uniforms, uniform samplers, which is just a fancy name for textures - which are in turn mathematically just huge lookup tables for particular functions of the coordinates.

Thus, if the texture coordinates have been properly attached and interpolated, in the fragment shader a texture sheet can be referenced to color a pixel. In addition, various color blending and lighting operations can be performed at this stage.
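A minimal fragment shader doing just that could look as follows - again a sketch in legacy GLSL 1.20 with freely chosen names, assuming the vertex shader sketch above has passed texCoord and vertexColor on as varyings and the application has bound a texture to the sampler uniform as part of the state set:

    #version 120

    uniform sampler2D base_texture;   // the texture sheet - a 'huge lookup table'
    varying vec2 texCoord;            // interpolated by the rasterizer
    varying vec4 vertexColor;

    void main()
    {
        // look up the color in the texture at the interpolated coordinates
        vec4 texel = texture2D(base_texture, texCoord);

        // blend with the interpolated per-vertex color and write the final pixel color
        gl_FragColor = texel * vertexColor;
    }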

Like the vertex shader, the fragment shader has local information only (for the same reason). In particular, it does not have access to adjacent pixel color values.

The fragment shader is expected to output the most important quantity of all - the color of the pixel as it appears on the screen. The fragment shader output is the last word: regardless of what colors and textures are assigned to an object, if the fragment shader does not use them, the screen will look as the fragment shader says.
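Coming back to the alpha mask mentioned in the rasterizer section: if the fragment shader does alter the alpha channel, an equivalent test can be done inside the shader itself. A variant of the sketch above which simply drops pixels with low alpha (the threshold value is an arbitrary example) would be:

    #version 120

    uniform sampler2D base_texture;
    varying vec2 texCoord;

    void main()
    {
        vec4 texel = texture2D(base_texture, texCoord);

        if (texel.a < 0.5)
            discard;          // the pixel is dropped and never appears on screen

        gl_FragColor = texel;
    }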

(Actually, it's not quite true that the fragment shader necessarily outputs the screen pixel - it's also possible to render into a texture and use this texture in a second fragment pass in more complicated setups.)

Continue with Basic GLSL structures.


