Thursday, March 28, 2013

Uniformly distributed points on sphere

Sooner or later everybody will need uniformly distributed points on a sphere. There doesn't seem to be a standard method for doing this, so I wrote a very simple iterative algorithm that pushes verts away from each other while continuously normalizing the point data. This will eventually find a stable state where the distance between any two neighboring points are very similar. Performance is terrible but it gets the job done, so only use this for offline stuff. There is also an option for distributing points on a hemisphere (y>0). Set the number of iterations to at least the number of input points for a good distribution.

Source code: uniformpoints.cpp



Sunday, March 24, 2013

Convex Hulls Revisited

I have written about 3D convex hull generation here before. I find it a very appealing problem because it is so well defined and to my knowledge there is no de facto standard algorithm or implementation. I come back to this topic every now and then since I need a good implementation myself.

Quickhull is probably the most popular algorithm, but it is hard to implement in a robust way. The qhull implementation has a somewhat questionable license and more significantly it is a really complex piece of software and contains a bunch of other features. I'm on a quest to create a fast, robust convex hull generator that is free to use and is self-contained in a single cpp file.

I'm currently experimenting with an algorithm based on the support mapping, often used in physics and collision detection. The support mapping for a point cloud for a given direction is the point that is farthest in that direction, which simply means finding the point with maximum dot(dir, point). The supporting point for any direction is guaranteed to be on the convex hull of the point cloud. Very convenient.

The algorithm starts with the smallest possible closed mesh - three vertices connected by two triangles facing in opposite directions. Any three points will do, as long as they are on the convex hull (supporting vertices). Each triangle is then exanded by the supporting vertex in the normal direction. This is done by splitting the triangle into three triangles, all connected to the new vertex. This expansion step may cause the mesh to become concave, so the expansion step needs to be followed by an unfolding step, where convave edges are "flipped" to make the mesh convex again.

Flipping one edge may cause nearby edges to become concave, so this step needs to be repeated until all edges are convex, effectively "untangling" any wrinkles introduced by the expansion. Below is a short clip visualising the construction of a hull through a series of expansion and unfolding steps. For clarity, there is only 12 points and they are all on the convex hull.



The interesting thing about this method is that it is based primarily on topology. Both the expansion and the unfolding step guarantee that the mesh is kept well-defined and closed, so there are no degenerate cases. The only critical computation is how to determine wether an edge is convex or not. I'm still investigating the most robust alternative here. My current one does not deal with all degenerate cases, but I'm pretty sure this can be done.

The algorithm has a number of desirable properties:
  • Can be used with a tolerance in the expansion step for automatic simplification.
  • Relatively easy to implement (mine is about 400 lines of code).
  • Handles co-planar and degenerate input data.
  • The output mesh is built entirely from the input vertices.
  • Can be modified to use any support mapping function, not just point clouds.
I'm not entirely sure this is a novel idea. I'd actually be surprised if it is, given its simplicity, but I haven't seen any references to it before. Please write a comment if you know. I'll get back with more details and a performance comparison later.

Friday, March 8, 2013

Mediocre properties


I think most games use some kind of property system as a way to expose, edit and serialize parameters for game objects. There are of course very many ways to implement it, but here is how I'm currently doing.

Each game object that is big enough to carry properties has a PropertyBag instance. The types of properties should ideally be setup per class, not per object, but that requires extra code and I always strive for keeping the amount of code to a minimum. Hence, the construction of a property bag is done in the object constructor and might look like this:

mProperties.add("density", "1.0");
mProperties.add("color", "1 0 1");

Yes, those are strings. Most game developers don't like them. I do, and I will explain why later. Now, setting up properties for every object this way is both time consuming and memory intensive, and there will most likely be lots of instances with exactly the same property configuration. Therefore, property configurations are cached on a per class class basis:

GameObject::GameObject()
{
if (mProperties.init("GameObject"))
{
mProperties.add("density", "1.0");
mProperties.add("color", "1 0 1");
}
}

So the property definitions are not actually stored in each object, but merely a pointer to a definition which is created by the first object using it. This of course reduces memory overhead by magnitudes. The system also handles inheritance, so a derived class can add more properties to an existing definition, but the amount of data per object to store the whole definition is only a single pointer. In a derived class, the parent constructor will run first, adding properties and then each derived class adding their own in order. The init method detects this by being called several times with different strings.

Now that the property bag is configured I can start reading and writing properties using various get/set methods:

float density = mProperties.getFloat("mass");

This will convert the string "1.0" to a float and return it. If the string isn't numerical, it will still return a valid number (0.0), but issue a warning. I also have some operator overloading to allow immediate conversion from most basic types:

float density = mProperties["mass"];

The system accepts any type conversion on the fly:

mProperties["mass"] = 2.0f;
string str = mProperties["mass"]; // str is now "2.0"

This is all very flexible, but performance is somewhat questionable. I typically cache any property that is used every frame in a member variable. Therefore each game object has a loadProperties() method that gets called whenever something changes. This gives the object a chance to cache a local copy of each performance critical property:

GameObject::loadProperties()
{
mDensity = mProperties["density"];
}

So how much memory is being used? As long as an object doesn't override the default value, nothing is stored per object, hence adding more properties to a class doesn't make objects bigger unless you cache a local copy per instance for performance reasons. The system also supports template values, so in addition to default values, each object can also be assigned a certain template (another pointer). Templates are defined in a simple XML file with key/value combinations that have no knowledge whatsoever about the object type. The same template can even be assigned to property bags of totally different types if wanted.

For instance an object that want to use a more slippery form of rubber can use the rubber template, but override the friction property explicitly and only the new friction value will be stored in the object.

Using strings have two obvious problems: a) It is slow and b) Loose bindings are sensitive to typos. The first issue is seriously overrated IMO. Comparing strings is not as slow as most people claim, especially when using a custom string object with built-in storage. In most cases you only need to compare the first letter of two strings to detect a mismatch. The second issue is more worrying, but the system can easily issue warnings when asking for properties that do not exist. Then at least you will be notified of typos at runtime.

There are several benefits that I think clearly outweigh the negatives. The ability to serialize an object into XML/JSON/YAML/etc using solely the property bag is one of them (the entire level loading/saving code is 50 lines of code, and there is no additional serialization code per class), automatic property editing from a level editor is another. The editor doesn't need to know anything about what is being edited, it just presents all the properties of an object and their values as strings (another 50 lines of code to edit any property of any object in the game). Keeping the property names and their values as strings also allow for trivial scripting integration. We use Lua and have one very useful function to query any property: mgGet("name.property") where name is the object name and property is the property name. The result is always a string, which can then be converted to whatever type seems fit (not even 50 lines of code to access any property of any object in whole game from script).

One improvement I would like to make is how the property values are stored internally. Currently I store them as strings regardless of what they represent, but a more efficient way would of course be to store them in binary form, based on what they represent. The format can be chosen either during initialization or at every set operation.

I currently consider the property bag a blueprint of the object rather than a direct mapping of the object internal state. Hence, if an objects internal state is updated at runtime I don't update the properties at that point. That way I can easily reset an entire level to the state it was at load time. There are benefits of keeping a direct mapping too, so I don't really have a strong opinion as long as it's consistent. A direct mapping certainly puts the property system under more stress. Anyone who tried?

Thursday, January 10, 2013

Development environment


A lot of people have asked about our development environment so I thought I'd write a post about it here. Both Sprinkle an Granny Smith as well as our two titles in development follow roughly the same pattern. I do all day-to-day development and testing on Mac and PC, compiling to native Win32/MacOS applications. Hence the code is cross-platform and mouse input is used to emulate touch input. My preference for actual coding is Visual Studio due to its superior debugging and Intellisense. On the Mac I use Textmate, shell scripts and makefiles. I only use XCode for iOS testing and deployment. I have used the XCode debugger on a few occasions, but it very rarely manages to do any meaningful on-target debugging. This might be due to my broken project files though.. I run Visual Studio through VMWare on the Mac as well, so it's actually only one physical Machine. I used a separate development PC laptop before, but the VMWare solution is far superior in almost every aspect and removes a lot of clutter.

I'm not using any off-the-shelf game engine, but rather a collection of homebrew classes that work well together, so you could say the "engine" is developed specifically for each title. Examples of such classes are strings, streams, input handling, data containers, vector math, scripting, sound, etc. It does not contain any code for actual game objects, rendering pipeline, update loop, etc, so it's rather low level. I personally think this is better than using a game engine in most cases, since you can churn out more performance, and you never hit any artificial boundaries of a third party solution.

I do use certain libraries for specific tasks, such as zlib, TinyXML (just switched to RapidXML), Lua, Box2D, Clipper, etc, but it's very isolated pieces of software that do one thing well. If they wouldn't work for a project they can easily be replaced individually. All libraries are included in the project as source code, no static or dynamic libraries.

The build system is a python script that scans the source tree and outputs a makefile, Visual Studio project or Android ant files. I have not yet tried XCode project file generation because XCode project files (dirs actually) are totally horrible. To be fair Visual Studio project files are horrible too. Actually I don't think I have ever seen an IDE with a sane project file format. Is it really that hard? I should probably try cmake, but it takes time to learn, and doesn't necessarily update when you upgrade your IDE. I generally think that about a lot of things - learning a tool or middleware is often more of an investment than just doing it yourself. Besides, if you do it yourself you gain 100% insight into the inner workings and can fix problems immediately when they show up instead of communicating with support and/or wait for a fix. Anyway.

The desktop binaries read raw assets (XML, jpeg, png, etc) directly from the data folder for convenience while both iOS and Android require asset conversion. The asset conversion script is quite similar to the build system - it scans a directory tree and outputs a makefile. Make will then automatically keep track of assets that need conversion using the file modification date, and it can branch out on multiple threads (using the -j switch) for increased performance. Make is really just as awesome for converting assets as it is for compiling source code. On iOS some images are compressed using texturetool and some are using our custom format called MTX, which is essentially just a way to compress PNG images into JPEG with a separate, compressed alpha channel to save space. The Android version also uses MTX compression as well as general LZ compression on most data files. The APK itself is not compressed so generic asset compression is more important here.

The Android version uses the NativeActivity system available from Android 2.3, so basically the whole game, including setup code is C++. We do have a very thin layer of Java to handle in-app billing and query device capabilities, etc. It's quite remarkable how much of the code is identical between the iOS and Android version. It's really just the setup code, touch input and audio back-end that are completely different.

Sprinkle and Granny Smith both have built-in level editors that are accessible on the Mac and PC version. All level design and graphic assets are done within that editor (except for textures, which are done in Photoshop and Illustrator). Scripting is done in Lua using a regular text editor. That's about it.

Saturday, December 22, 2012

Small Object Allocator


Prior to the latest update of Granny Smith, we ran into memory problems when loading levels. This was due to the unfortunate choice of using XML and TinyXML for level data. I'm generally very positive to both XMl as a data format (as opposed to most game developers) and TinyXML as a library. However, since we don't use instancing, a level can easily be 500 kb or more and TinyXMl will parse the entire file into a DOM structure before it can be read. This results in a myriad of small allocations ranging from a few bytes up to about 100 bytes. I'm not sure exactly what the per-allocation overhead is on iOS and Android, but I suppose it's at least 16 or 32 bytes. In addition the LUA library also makes a fair amount of small allocations.

So far I had not bothered using a custom allocator for anything since I usually try to avoid allocations alltogether. Strings, dynamic arrays, stream buffers, etc, all have a templated built-in storage, so you can reserve a specific amount when they are declared:

So instead of:

std::vector<QiVec3> tmpVerts;

I use:

QiArrayInplace<QiVec3, 128> tmpVerts;

Which means I reserve memory for 128 verts on the stack before any dynamic allocation happen. Anyway, I decided to try a small object allocator for the TinyXML and LUA allocations and my choice fell on the simplest possible solution - a fixed memory area for small allocations.

The idea is to reserve a fixed amount of memory at application startup, say 2 Mb, devide it into sections of different sizes. In Granny Smith I ended up using the following sizes: 8, 16, 32, 48, 64, 96 and 128 bytes, but they can of course be chosen to match the most common object sizes etc. The number of slots for each size is based on real statistics from the game. In my case, the 48 byte allocation was the most common, so almost half of the preallocated space is reserved for that size.

Now, for allocating out of this pool I use a packed bitfield for each allocation size to track which slots are in use. I also track the lowest free slot for each size. In the worst case scenario the whole bitfield needs to be searched. This is a linear operation, but in practice it is very cheap - one can search 32 slots per instruction (actually 64 nowadays) and it's a contiguous memory block, so very cache-friendly. I use 12.800 slots for 48 byte allocations, which translates to 400 32-bit integer comparisons all lined up in memory. This is the worst case scenario. On average I typically only need to search a few bytes. Deallocation is basically fo free. The pointer adress reveals if it's a small object allocation (if the pointer lies within the preallocated memory block) and also what size the allocation is (all allocations of the same size are grouped together). The actual deallocation is done by clearing the corresponding bit and potentially update the lowest free slot.

The most obvious win here is not speed, but low memory overhead. It uses only one bit of memory overhead per allocation (plus potential padding bytes up to the smallest matching size), so doing 1.000 allocations of 8 byte each will require exactly 8.125 bytes of memory. Performance is great, but there might be other allocators out there that are even better. Since it is all allocated out of a contiguous memory block it is cache friendly and does not cause any fragmentation.

It all sounds great, but there is of course one huge downside to it - when you run out of slots your're out. The allocator can only be made this efficient by using a single preallocated memory block, so you need to know the memory requirements at load time. However, when running out of slots one can just start to use the regular malloc instead, also for small objects. These will of course have the regular overhead, but at least it "fails" gracefully. I'm really happy with the results so far. Levels load very noticably faster and the overall memory usage went down. For games in particular, where the memory allocation pattern and total memory usage is known at compile time, I think this is a pretty good choice for general purpose small object allocator.

Tuesday, December 4, 2012

Programmers are people


Something that annoyed me for many years now is how the games industry (well, software development community in general) treat individual developers as "resources" in a project schedule. It doesn't just sound awful and is a direct insult to the profession, it's also plain wrong, stupid and counter-productive.

Yeah, I know what you're thinking.. but yes I would argue programming is a creative line of work, you are designing and building something. Do you think a car manufacturer talks about "designer resources" when coming up with a new model? Does a TV show producer have "screenplay writing resources" (well maybe they do, I don't have a clue). Of course it matters who designs and builds something. You can throw hundreds of bad composers on a soundtrack, but it still sounds horrible.

How many weeks does task X take to complete? The answer should always be the same - it depends on who does it. There are no man-hours, man-weeks or man-months in this industry. Building software is not like building a house. There are always tons of vastly different solutions to a given task. There might even be a tool out there that already solved the problem. Maybe the task itself is an effect of a previous design mistake?

When loading boxes of oranges a good worker might be twice as fast a bad worker. An awesome worker might be three times faster, but that's really pushing it. In software a highly motivated, awesome programmer can easily replace a whole room full of mediocre (not Mediocre!) programmers and still produce way better results, fewer errors, more maintainable code at a fraction of the cost.

Finding such talented programmers is of course very hard, but I wouldn't say they are quite as rare as most people think - keeping an awesome programmer motivated is the bigger challenge. The magic only happens when a great programmer is given freedom and responsibility. Unfortunately very few people seem to have realized this - the typical scenario being a big company headhunting some of the best programmers in the world and pay them tons of money to sit in a cubicle and munch dull scrum tasks off a backlog. Not so awesome anymore? Seriously, what did you expect? If you treat great programmers as generic programming resources they will become just that.

Thursday, October 25, 2012

How Granny Got the Look


In my last post I mentioned briefly how all graphical objects in Granny Smith are made out of 2D polygons which are transformed into 3D at load time. It was never initially designed for the sometimes complex environments in the game, but we decided to stick to this method instead of involving a separate 3D modelling software. I think, at times real 3D objects could have come in handy, but overall the current workflow is preferable since it's much more efficient. There is no need to track separate files or assets - every level is totally self-contained. Because the 2D data is so small, we don't even use instancing, so there is no risk of trashing another level when altering objects.

This is how a factor level looks in the editor.


The most fundamental transform is a simple extrude, but we can also apply a chamfer or fillet in the process. This is used extensively, especially for round hills and other natural shapes in the game. This beveling is done by gradually shrinking the polygon while extruding it, in one or more iterations.

Shrinking or expanding a polygon is not as easy as it sounds. For "well-behaved" polygons there is no problem, but sometimes vertices disappear in tight corners, or new ones must be added. I tried several different ways of implementing this myself, but it's really hard to find a robust method that never fails, or even one that failes gracefully. After a few failed attempts, I found the awesome Clipper library, which can do all kinds of polygon operations, including expanding and shrinking. It's reasonably fast, very robust and super-easy to use, I highly recommend. Even when using the Clipper library it was not trivial to get the beveling to work correctly. The Clipper library does not track or correlate vertices, so after a shrink operation you have no idea how the new vertices correlate to the old ones, hence it is really hard to stitch the two polygons together with triangles. It took me a few tries to implement a robust stitching algorithm, but I finally came up with one that deals with all well-behaved polygons, and most (but not all) tricky corner cases.




Shadows also deserves a mention. I started experimenting with different methods for smooth soft shadows very early on. The traditional way of using precomputed low resolution shadow maps didn't quite fit our needs, because all geometry is generated on the fly, and levels can be quite large. Since most geometry in Granny Smith is front-facing, extruded, flat, 2D polygons I came up with a scheme where the shadows are semi-transparent triangles based on projecting the 2D polygons onto each other. Thanks to Clipper, I already had a great toolbox for this. Each polygon is expanded and clipped against overlapping polygons in the background. The resulting "shadow" polygon then uses vertex coloring to smooth out the penumbra and a special shadow shader to achieve quadratic fall-off. All these shadow triangles are put in separate vertex buffers and rendered simultaneously. The engine supports dynamic soft shadows as well, but it's rather expensive, because all the expensive geometry clipping needs to happen every frame, so it's only being used in a very few places in the game. Because shadows are computed from the 2D polygons, there will only be shadows in the XY plane. To somewhat overcome this I also added self-shadowing along the extruded surface using the same vertex coloring scheme, so creases get a darker tint.
The anatomy of Granny Smith

The characters are basically a composite of 2D sprites, but slightly rotated to compensate for the camera angle, so it's somewhere in between an oriented sprite and a billboard. There was never a discussion about using "real" 3D characters for this game. My personal opinion is that 2D character are highly underrated for this type of game. 2D characters obviously yield better performance, but I'd also argue they also look better in many cases, especially on low resolution devices. Polygon aliasing completely disappear, because all edges are drawn into the alpha channel of the texture and rendered with mipmapping, so the characters always look crisp and sharp.


Characters is not the only area where we combine 2D graphics with 3D objects. Grass and foliage are two examples. The grass is added procedurally, while the decals for trees and bushes are placed manually along the rim of the object to hide sharp polygon edges and add extra detal. The grass is rendered with a special vertex shader to sway in the wind.


After completing a level, you get a replay in vignette and sepia color and with occasional bluring plus added dust and scratches to give the impression of an old movie. The vintage effect is just a shader effect in a post pass, but the replay itself features alternative camera angles and slow-motion effects which took some time to get right. The camera angles are determined by what's going on in the game. For example, it often switches to slow-motion close up right before fracturing an object. The replay data is basically just recorded input for the player, plus position correction data in case of divergence, similar to a networked multiplayer game, but I also record "special events" that are used to trigger camera angles. The replay data is analyzed and all the camera angles are decided before the replay starts. Choosing camera angles on the fly wouldn't work since you want to switch camera before the action happens. Getting everything in the game to support slow-motion playback without diverging too much was a real challenge, and you can still see artefacts here and there. The replay system is also driving the apple thief playback (in very subtle slow-motion) during regular gameplay.