Building a fast tile engine — pt. 2

Last week, we reviewed the initial approach to drawing tilemaps in Jaakan:

  • Use one sprite per tile
  • Cache each layer to a big texture

This week we’re going to expand on a smarter technique to display a lot of tiles in one fell swoop.

Boundaries

To make something fast, first we must understand how it works, what is slow, what is cheap, etc. In the case of graphics, as often, we can reason in terms of boundaries

On the left, we’ve got all of the game’s code, and some native libraries. They execute on the CPU. Apart from some odd bits like input handling or sound processing, everything runs in a single thread, one instruction after the other.

On the right, we’ve got a power-house of a different kind: even cheap integrated chipsets nowadays have a lot of computing power available, but it works in a different fashion: lots of operations run in parallel. A GTX 560, for example, has a whopping 336 cores, which is a far cry from modern CPUs which even recently, barely exceed 8 cores (although with higher clock rates).

Keep in mind this is all a very simplified explanation of what’s really going on: my own lack of knowledge prevents me from going into too much detail, and it would really be beyond the scope of this article. What we can infer from the diagram above, though, is that much like it is expensive to do back-and-forths between compiled and interpreted code, it’s also relatively expensive to transfer data to and back from the GPU.

What kind of data can we store on the GPU? The two main types of semi-permanent objects we can store there are:

  • Geometry: via Vertex Buffer Objects (VBOs)
  • Textures

As we saw last week, geometry is just mostly a bunch of triangles, however it is packed and whether or not it has texture coordinates, colors, or whatever kind of attributes we want each vertex to have.

As for textures well, they’re images. Most of the time. In fact, nothing prevents us from storing something else in textures. They’re just a grid of values, usually 4 bytes per value (for Red, Green, Blue, Alpha RGBA) — again, a simplification, there are actually many different texture formats.

Data storage

Let’s go back to Jaakan for a bit: what do we actually have in our layers? Remember, we want to merge both polygon and tile layers, so for each tile we have to store:

  • Its coordinates in a tile atlas
  • Its shape

As for coordinates, we can assume that 256x256 tiles (ie. 65536) should be enough for the whole game. We can store those tiles in a single 8192x8192 texture. Since we have multiple tilesets, nothing prevents us from packing it in to a single megatexture, much like dye does for fonts.

As for map data, we can use 1 byte to store the row in the tileset megatexture, and 1 byte to store the column. That leaves us two free bytes.

Regarding the shape, we have 8 control points, like we’ve seen in weekly update 15. Each can be either on or off, so we can easily store that in a single byte: every control point correspond to 1 bit, false for off, true for on.

And there we go, storing a full layer’s worth of tile data with 1 byte to spare:

That gives a 40x22 texture that looks a little bit like this:

full

Fragment shaders

Storing map data in a texture means we can send it to the GPU in one go, and since most layers are static, that’s very good — but it doesn’t tell us how we can make use of it to display our actual tiles.

One possible approach (which isn’t new), is to use a fragment shader, and sample both our “map data texture” (to retrieve tile information), and our tile atlas, to draw the right color.

See, fragment shaders allow us to decide what color is drawn at any given pixel:

This diagram is taken from the wonderful open.gl website, which you should go read right now if you’re curious about how modern graphics pipeline work.

So, for each fragment, what we need to do is:

  • Figure out which tile of the map we’re drawing
  • Sample the map data texture to get the column & row info
  • Figure out where in the tile we are
  • Sample the tileset from the column & row and the offset inside the tile to find the final color

Here’s a naive implementation of it in GLSL, OpenGL’s shader language:

And just like that, we can display a full layer with only one quad, one draw call, one tile atlas, and one map data texture:

full

(Astute readers might have noticed some tiles are plain, without numbers on them: that’s because the tile atlas is 16x16 tiles, but the randomly generated map data texture refers to tiles outside of these boundaries. Since the texture is set to GL_CLAMP, it only displays the color of the texture’s border. Kudos if you caught that!)

Adding polygons

I debated making another part for that, since this article is getting long, but it’s really just the same trick again. One can think of polygon shapes as masks for the tiles. There are 2^8 = 256 possible variants (some of them are degenerate, as we shall soon see), so it’s easy to precompute all possible polygon shapes in code, and store it in a texture in memory, as shown here using Apple’s OpenGL profiler

full

Then it’s just a matter of reading the blue channel of the map data texture, sampling the polygon atlas, and multiplying the color we got from the tile atlas with the color from the polygon atlas, which will keep all pixels inside the polygon and eliminate (make transparent) all pixels outside of it.

And that’s really all there is to it. Updating the map is a simple texture upload, and even updating it once per frame is really, really smooth:

Stay tuned for part 3, where we’ll tackle the final challenge in displaying jaakan maps in an efficient way. Don’t forget to share and follow us on Twitter to stay up-to-date with the latest developments!