<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-779328462175891131</id><updated>2012-01-27T09:49:27.762-08:00</updated><title type='text'>Game physics bits</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>22</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-8440677623245581873</id><published>2011-11-28T07:23:00.000-08:00</published><updated>2011-11-28T09:45:39.385-08:00</updated><title type='text'>Sprinkle behind the scenes</title><content type='html'>&lt;div style="text-align: left;"&gt;This summer me and a friend released a physics puzzler for &lt;a href="http://itunes.apple.com/se/app/sprinkle-water-splashing-fire/id447791438?mt=8"&gt;iOS&lt;/a&gt; and &lt;a href="https://market.android.com/details?id=com.mediocre.sprinkle"&gt;Android&lt;/a&gt; based on fluid simulation. It started as a really small project almost a year ago, but grew along the way and has been really well received on both platforms.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Last year i posted &lt;a href="http://tuxedolabs.blogspot.com/2010/08/milkmans-mistake.html"&gt;movie&lt;/a&gt; &lt;a href="http://tuxedolabs.blogspot.com/2010/07/academic-frustration-and-running.html"&gt;clips&lt;/a&gt; from a fluid simulator I was working on, and the fluid in Sprinkle is basically the same algorithm, but in 2D and with lots of optimizations. The simulator is particle-based, but not traditional SPH. Instead, the pressure and velocity is solved with a regular "sequential impulse"-solver. It's quite a mindjob to work out the constraint formulation, but it's worth the effort, since you get unconditionally stable fluid that is reasonably fast and interacts seamlessly with rigid bodies.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;iframe width="560" height="315" src="http://www.youtube.com/embed/DO1lAEHW7sY" frameborder="0" allowfullscreen=""&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The most compoutationally intensive operation is neighbor finding. I'm using a pretty standard spatial hashing technique, with a twist. Each particle gets assigned which quadrant within the cell it belongs to, and a table is used to search neighboring cells, so only four cells need to be searched for each particle instead of nine. For this to work, the cell size needs to be at least four times the particle radius. I also do all neighbor finding on quantized 8-bit positions within each cell, and cell positions are also 8-bit quantized. This is to reduce data size and cache impact.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The fluid step does two iterations per frame to get the desired incompressibility, but neighbors are only computed once. There is also a maximum number of neighbors per particle, and all pair-wise information is stored directly in each particle, duplicated per pair, so it can be traversed linearly. Particle positions are stored in a separate list to minimize cache impact when updated.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For rigid bodies, the excellent &lt;a href="http://box2d.org/"&gt;Box2D&lt;/a&gt; engine is used, and was loosely integrated with the fluid. Box2D is also setup to do two velocity iterations per frame, but the rigid/fluid solver only kicks in once per frame, so the overall update goes something like: rigid -&amp;gt; rigid -&amp;gt; fluid -&amp;gt; fluid -&amp;gt; rigid/fluid. Ideally I should have integrated the fluid solver more tightly, so the update loop would be rigid -&amp;gt; fluid -&amp;gt; rigid/fluid -&amp;gt; rigid -&amp;gt; fluid -&amp;gt; rigid/fluid, but it turned out to be unnecessary for the type of scenarios we wanted to create.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There is a maximum of 600 particles simulated at the same time, so there is a life time on each particle of about 3.5 seconds, then they disappear and gets recycled. An important part of levels design was of course to create levels that drain naturally, and do not rely on pools of water.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;img src="http://1.bp.blogspot.com/-ItN3ye32XtQ/TtOoGAFfyvI/AAAAAAAAAus/kiGTvEPL39s/s400/mzl.wdtnyvwl.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5680068376100063986" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 267px; " /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In addition to the 600 simulated particles, there is 240 simplified particles, which only follow ballistic trajectories until they hit something. These are rendered separately in a brighter color to look more like foam and splashes. I spawn these "decoration" particles based on certain criteria, for example when fluid is hitting something, or is moving upwards. Full collision detection is still being done on these particles, in very much the same way as the regular particles, but they are instantly removed as soon as they hit something. The particle collison detection is performed first on the fluid cells, using standard Box2D queries and then for each particle using custom methods. Each rigid body shape has a dual representation used only for fluid collisions, and internal edges between convexes in a concave shape are filtered out in a preprocessing step to avoid "edge bumps". This was really crucial for any level that includes a ramp or a slide, which are highly concave.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For rendering, a dynamic particle mesh is built on the cpu every frame and rendered by the GPU using a simple refraction shader. On Android, I use double-buffered dynamic VBOs to update the particle mesh, while on iOS it seems faster to just use a vertex array. I guess for these shared memory architectures most of the old graphics wisdom is no longer applicable. Particles are stretched, aligned, resized, colored, etc based on their physical properties, so it's quite sophisticated for being a particle renderer. If there is one thing that really stands out about Sprinkle I'd say it's how smooth the water looks, considering it's only 600 simulated particles.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The fluid simulation is done on a separate thread, and so is the particle mesh generation, so it scales quite well onto multiple cores. For Tegra 3, which is a quad-core architecture we also added a smoke simulation on a separate thread, but that will be another blog post.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Getting the game to run at 60 FPS on a mobile device was really quite a challenge. The 0.8-1.2 GHz ARM processors found in mobile devices today are indeed quite fast, but if you compare to a regular desktop PC, it's actually still quite far away. For a frame time of 16 ms on an iPad 1 I had to target about 1.5 ms on my desktop PC, so it's roughly a factor of 10x.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-8440677623245581873?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/8440677623245581873/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2011/11/sprinkle-behind-scenes.html#comment-form' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8440677623245581873'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8440677623245581873'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2011/11/sprinkle-behind-scenes.html' title='Sprinkle behind the scenes'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://img.youtube.com/vi/DO1lAEHW7sY/default.jpg' height='72' width='72'/><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-288094820306661300</id><published>2011-03-31T13:42:00.000-07:00</published><updated>2011-03-31T13:47:02.251-07:00</updated><title type='text'>Impressions of the green robot</title><content type='html'>&lt;div&gt;I've been working on a mobile, physics-based game over the last five months (I'll post stuff about the project very soon) and today I started toying around with Android and porting the game. I'm not really sure what to think yet honestly. Some things are better than iOS development and other things are quite annoying. I really appreciate having command line tools for everything, and the ability to login to the device and do maintenance. Especially the ability to login and run a shell command on the device via a command line tool on the host. That way you can run scripts on your development machine that coordinate things both on the device and on the host at the same time. Awesome!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When it comes to development tools I think command line tools are far superior to graphical user interace in most cases (except for debuggers). I'm pretty happy with visual studio, but it's probably because I've been more or less forced to use it every day for the last ten years. Nothing beats having good command line tools and the ability to script everything the way you want them.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Being dependent on Java definitely sucks. They have really tried to work around it in the latest releases of the NDK, but it's still there, and you really do notice it. A lot. For a game programmer who keeps all his stuff in C++ this is no worse than Apple's stupid fascination for Objective-C though. A couple of revisions away, the Android NDK will probably be pretty complete, while iOS will always be knee-deep in Objective-C, forcing us to create horrible wrappers.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Android documentation is bad in the best case and non-existent everywhere else, and the whole development procedure is very far from streamlined with a gazillion tools and configuration files to tie everything together. Note though that I'm talking about writing native C++ apps using Open GL ES 2 here, not the Java crap you see in all the tutorials. (By the way, the NDK compiler did not support C++ exceptions until very recently. I talked about exactly this in my &lt;a href="http://tuxedolabs.blogspot.com/2011/03/general-wisdom.html"&gt;previous blog post&lt;/a&gt;)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Asset management is the part I like the least about Android so far. You throw your files in the asset folder, and it automatically gets compressed into a zipped bundle. Then you can stream access resources from this bundle using the NDK, but not quite the way you'd expect. On iOS this works beautifully by just translating the path into a location on the device and then you can use fopen, fseek or whatever you like. On Android the tools automatically compress stuff into the bundle based on the file suffix (oh please..), and there doesn't seem to be any way of accessing compressed data from the NDK unless you write your of virtual file system. Solution? Add a .mp3 suffix to all the files! Seriosly...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-288094820306661300?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/288094820306661300/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2011/03/impressions-of-green-robot.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/288094820306661300'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/288094820306661300'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2011/03/impressions-of-green-robot.html' title='Impressions of the green robot'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-3908851005712923762</id><published>2011-03-14T09:39:00.000-07:00</published><updated>2011-03-14T09:50:46.491-07:00</updated><title type='text'>General wisdom</title><content type='html'>&lt;div&gt;I'm following quite a few game programming blogs, and whenever there is a post about a lifehack or general wisdom that can help me simplify my work I'm all ears. So, I thought I'd share some of my own experiences:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Automate everything that can be automated.&lt;/b&gt; Especially project file generation. Editing project files is a real energy drainer, and even though IDE's are trying to make the process smooth, it never is. This becomes a big problem first when multiple platforms come into the picture. Personally I have a Python script that take a few input parameters, scans the source tree and outputs a nice project file for Visual Studio, or a makefile. You have to bite the sour apple every time Visual Studio changes it's project file format, but it's so worth it. I also have similar scripts for documentation, distribution and in some cases code generation. Writing the scripts take a while, but they can be reused, you get better at writing them every time you do it, and it's more fun than doing dull monkeywork over and over again.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Minimize external library dependencies.&lt;/b&gt; People are way too eager on including external libraries and middleware in their projects. I think it is very common that the usage of libraries and middleware end up costing way more than it would have done just writing the code yourself. Only include an external library to your project if it: 1) Solves one specific task extremely well. 2) Can be trusted doing that. 3) Puts you in control of all memory and IO operations. 4) Can easily be included as source code.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Keep everything in the same project.&lt;/b&gt; This ties into the last criteria for using external libraries above. I want all third party libraries to be part of the source tree. Not a dynamic library, not a static library, not even a separate project in Visual Studio, just plain soure code in a separate folder. This is important, because it simplifies cross-platform development, especially when automatically generating project files. It also completely takes away the problems with conflicting runtimes for static libraries, mismatching PDB's, etc. It's all going in the same binary anyway, just put your files in the same project and be done with it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Refactor code by first adding new and then remove old. &lt;/b&gt;I used to do it the other way around for a long time, ripping out what's ugly, leaving the code base in a broken state until the replacement code is in place. Yes, it sounds kind of obvious in retrospect, but it took me a long time to actually implement this behavior in practice. The only practical problem I've experienced is naming clashes. I usually add a suffix to the replacement code while developing and then remove once the original code is gone. As an example, if you want to replace your Vector3, create a new called Vector3New, and then gradually move your code base over to using Vector3New, while continuously testing, and when you're done, remove the original Vector3 and do a search/replace for Vector3New to Vector3.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Don't over-structure your code.&lt;/b&gt; This one is really hard. People often talk about code bases lacking structure, but I think it's a much worse and more common problem that a code base has inappropriate structure, or just too much of it. Consider this - given two implementation of some algorithm, where one is a couple of large messy functions in a single file and the other is fifteen files with a ton of inherited classes, abstract interfaces, visitors and decorators. Given none of them suits your current needs, which one would you rather refactor? My point is that you shouldn't try to structure something until you know all the requirements. Not to save time first building it, but because it's a pain in the ass to restructure something that already has structure. You can compare it to building a house. Would you rather start with a pile of building material or first disassemble an existing building? To me that's a no-brainer, even if the pile happens to be quite messy. Hence, never define an abstract interface with only one implementation, never write a manager that manages one object, etc. Just start out writing your desired functionality in the simplest possible way, &lt;i&gt;then&lt;/i&gt; structure it if and when there is a need for it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Stay away from modern C++ features and standard libraries.&lt;/b&gt; I've tried introducing bits and pieces from STL, boost, exceptions and RTTI throughout the years, but every time I do, something comes out and bites me right in the butt. Buggy implementation, compiler bugs, missing feaures, restrictions on memory alignment, etc. This is depressing and discouraging, but the sad truth we have to deal with. If you want your code to be truly portable without the hassle (not just in theory, but in practice) you'll have to stick to a very small subset of the C++ standard. In my experience it's better to just accept this and design for it rather than putting up a fight.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Use naming prefixes rather than namespaces.&lt;/b&gt; I was advocating namespaces for a long time, but now I've switched sides completely and use prefixes for everything. I kind of agree prefixes are ugly, but it has two obvious benefits that just makes it worth it. A) You can search your code base for &lt;i&gt;all&lt;/i&gt; instances of a particular class or function, and B) it makes forward declarations as easy as they should be. With namespaces, especially nested, forward declarations is just painful, to a point where you tend to not use them at all, leaving you with ridiculous build times. I usually don't even forward declare classes at the top any more, but rather inline them where needed, like: "class PfxSomeClass* myFunction(class PfxSomeOtherClass&amp;amp; param)". &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-3908851005712923762?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/3908851005712923762/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2011/03/general-wisdom.html#comment-form' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/3908851005712923762'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/3908851005712923762'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2011/03/general-wisdom.html' title='General wisdom'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-5081510608005943304</id><published>2010-10-13T08:53:00.000-07:00</published><updated>2010-10-13T08:56:55.577-07:00</updated><title type='text'>Schedulicious</title><content type='html'>&lt;div&gt;I've run into a pretty interesting/annoying problem while working on parallization of the physics pipeline.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I started off with a pretty standard thread pool, but proper physics requires quite a bit of synchronization. Especially in the solver, you need several full barriers to coordinate the threads. This is not so bad when you run thousands of objects, because the overhead of the synchronization is dwarfed by the massive simulation, but what would be more impressive is to go from say 3 ms per frame down to 1 ms.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Ideally you want your physics engine, rendering pipeline, AI, etc, all data parallel and run them sequentially every frame. The more common scenario of running physics on a separate thread and rendering on a separate thread introduces two extra frames of lag, which combined with double or triple buffering on the graphics card, plus another frame delay in the display (at least some modern TV sets do this) gives you very sluggish reponse. A secondary benefit from data-parallelism is the better cache utilization. It's more efficient to process all physics at once, and keep it in the L2, rather than doing all subsystems simultaneously, and have them trash the cache for each other along the way. Anyway, that's why I want the physics pipe to be super freakin data-parallel with a good speedup even for moderate scenes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;My problem with the standard synchronization primitives is not excessive locking. All tasks are pretty independent and data is locked very sparingly. My problem is the barrier sync, and especially getting the worker threads to start processing again after the sync. I use a standard semaphore which I release when there is a new job coming in. The worker threads then start munching jobs off a queue in any order they like. When there are no more jobs, they go to sleep again on the semaphore. Now, doing a barrier sync involves putting all worker threads to sleep, since I have to wait for the last job to finish, then I immediately feed in a new job and start processing again.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When a worker thread hits the semaphore, Windows first goes into kernel mode, then realizes that the thread should go to sleep, starts looking for other threads to run, and in most cases there won't be any, so it switches to the idle process, clocks down the core frequency, takes out a good book and starts preparing a cup of coffee. Then, one microsecond later (who knew!), the main thread releases the semaphore and the whole process is reversed. Well, at least this is what I think it does, but I'm no kernel expert...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now, since I know exacly what needs to be computed I can tell from the beginning when this is going to happen. Which is every barrier sync, every iteration, every frame. That seems somewhat unnecessary. So I started ripping out synchronization code from the thread pool and replacing it with interlocked functions and spinlocks, having the workers spin until there is more work, but only during the simulation step, and then put them properly back to sleep when we're done with physics.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This new strategy works perfectly and gives a really nice speedup on small scenes. The one annoying problem is that if there is anything, really &lt;i&gt;anything&lt;/i&gt; else running on the machine, the Windows scheduler gives my process a mad penalty. Even if they only spin for a small fraction of a frame. For some reason it chooses to swap out the entire process for several time quanta, leaving me with a 60 ms hickup every second or so. No cookie for Windows scheduler! Two questions:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A) Why does this bother Windows? Even on very small simulations where all workers sleep most of the frame this happens. That's right, most of my threads sleep on a semaphore most of the time, but I still get a penalty.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;B) How the heck does Windows even know that I'm using a spinlock? Does modern hardware have some sort of spinlock detection layer? I can't imagine the scheduler will be so severely bothered by any process that happens run four threads simultaneously, especially not for such a short time.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Seriously, what's going on? The problem goes away if I do a Sleep(0) in the spinlock, but that kind of defeats the purpose. Even though Sleep(0) doesn't switch to the idle process, it's still a rather costly operation, and an open invitation to the OS to go do something else, which is &lt;b&gt;exactly&lt;/b&gt; what I'm trying to prevent in the first place! Sigh. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-5081510608005943304?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/5081510608005943304/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/10/schedulicious.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/5081510608005943304'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/5081510608005943304'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/10/schedulicious.html' title='Schedulicious'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-908790406958771684</id><published>2010-08-20T08:19:00.000-07:00</published><updated>2010-08-20T08:22:30.953-07:00</updated><title type='text'>Milkmans mistake</title><content type='html'>&lt;div&gt;After two posts full of boring text and graphs I feel like I have to give away a crowd pleaser. Again, not a realtime simulation. About 120.000 particles, running at almost one FPS.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;object width="480" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/O9fVTaibxHM?fs=1&amp;amp;hl=en_US"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/O9fVTaibxHM?fs=1&amp;amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-908790406958771684?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/908790406958771684/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/08/milkmans-mistake.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/908790406958771684'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/908790406958771684'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/08/milkmans-mistake.html' title='Milkmans mistake'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-3996157744749708529</id><published>2010-08-18T04:45:00.001-07:00</published><updated>2010-08-18T05:03:22.887-07:00</updated><title type='text'>The life of an impulse</title><content type='html'>&lt;div&gt;For a long time I've wondered how impulses develop inside a rigid body solver. Mainly for trying to come up with a prediction of where the final magnitude would be as early as possible. I dumped the data for all impulses out to a CSV file and graphed it for a couple of simple test scenarios. Let's start with a small test case, a box pyramid of two layers (three boxes). All dirty tricks are switched off, so this is a plain vanilla solver. The Y axis is impulse magnitude for the normal impulses and the X axis is iterations.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://2.bp.blogspot.com/_bRDPJhc-wtE/TGvJH8KttnI/AAAAAAAAAts/F3LK214Dhi4/s1600/3-boxes.png"&gt;&lt;img src="http://2.bp.blogspot.com/_bRDPJhc-wtE/TGvJH8KttnI/AAAAAAAAAts/F3LK214Dhi4/s400/3-boxes.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5506716107637896818" style="cursor: pointer; width: 400px; height: 234px; " /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;At a first glance, there seems to be no suprises here. Impulses form asymptotic curves towards their solution and there is not a lot of action after about twelve iterations. But look again, especially at the bright yellow curve, that seems to steadily decrease and form a non-horizontal asymptote, and what about the deviation around iteration eight and nine? Let's try a more complex exmple, with a box pyramid of five rows (15 boxes). There is around 90 contacts, and it would be too messy to show them all in the same graph, so I chose ten of them from the middle of the stack:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://3.bp.blogspot.com/_bRDPJhc-wtE/TGvJIdLEjaI/AAAAAAAAAt0/NUJia6QVqDY/s1600/5-boxes-acc.png"&gt;&lt;img src="http://3.bp.blogspot.com/_bRDPJhc-wtE/TGvJIdLEjaI/AAAAAAAAAt0/NUJia6QVqDY/s400/5-boxes-acc.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5506716116497763746" style="cursor: pointer; width: 400px; height: 213px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is quite interesting - most of the curves are asymtotic to a non-horizontal line. Even at 64 iterations they have not found a stable configuration. Most interesting is maybe contact 45, which first goes up and then steadily decreases, while most othe contacts increase. Somehow, the solver is shifting around the weight. Maybe just because it can. Another graph we could study is the relative velocity along the normal direction (v from my previous post) and how it changes while iterating:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/_bRDPJhc-wtE/TGvJIpjRKII/AAAAAAAAAt8/TNKhtVSy6GA/s1600/5-rows-relvel-acc.png"&gt;&lt;img src="http://1.bp.blogspot.com/_bRDPJhc-wtE/TGvJIpjRKII/AAAAAAAAAt8/TNKhtVSy6GA/s400/5-rows-relvel-acc.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5506716119820478594" style="cursor: pointer; width: 400px; height: 213px; " /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This graph is from the same scenario, but it's not from the same run, so contact numbering might not match exactly. Interesting here is that all curves steadily decrease after just a few iterations and seem asymptotic to a horisontal line, so our odd graph fellow above doesn't actually mean that the velocity is shifting similarly. Most likely it is an effect of an overdetermined system (with four co-planar contact points, we can shift the impulses around a bit and still get the same result. Not that the solver really has a good reason to do this, but it seems to be doing it anyway).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I also tried a variant of the solver that doesn't clamp accumulated normal impulses. Hence, the delta impulse is just clamped to be positive at each iteration. This prevents the accumulated impulse to ever have a negative inclination:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://1.bp.blogspot.com/_bRDPJhc-wtE/TGvJI_9sKAI/AAAAAAAAAuE/jU9tuJlthfY/s1600/5-boxes-no-acc-3.png"&gt;&lt;img src="http://1.bp.blogspot.com/_bRDPJhc-wtE/TGvJI_9sKAI/AAAAAAAAAuE/jU9tuJlthfY/s400/5-boxes-no-acc-3.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5506716125836879874" style="cursor: pointer; width: 400px; height: 239px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is more what I expected. Each impulse is asymptotic to a horizontal line. Now if there only was a way to predict that asymptote and use it as initial guess, we would end up with a really stiff solver without having to keep track of contact points from the previous frame.  I'm personally quite sceptical that it can be done from just looking at the curves, but it might be worth a shot. What i find most interesting is how good visual results you can get from just terminating the whole proces after three or four iterations, considering how far off the curves are.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-3996157744749708529?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/3996157744749708529/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/08/life-of-impulse.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/3996157744749708529'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/3996157744749708529'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/08/life-of-impulse.html' title='The life of an impulse'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_bRDPJhc-wtE/TGvJH8KttnI/AAAAAAAAAts/F3LK214Dhi4/s72-c/3-boxes.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-4795080341517727006</id><published>2010-08-01T04:12:00.000-07:00</published><updated>2010-08-01T04:23:14.493-07:00</updated><title type='text'>Explaining the rigid body solver</title><content type='html'>&lt;div&gt;Following my last post about scientific papers not being written for engineers I will do an attempt at explaining a rigid body solver without equations:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Even though a rigid body scene may consists of hundreds of objects and thousands of contact points, a popular way to solve the problem is to solve each contact point in sequential order, one at a time. It sounds kind of lame, and compared to other methods it is, but if iterated a couple of times it gives really good results, and this is what most games are actually using, so let's focus on solving one contact without friction first:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So you have two objects and one contact point with a contact normal. Start by computing the velocity at the contact point for both objects and compute the difference between those vectors. Project that difference onto the contact normal (dot product). This is the contact's relative velocity along the contact normal and it indicates how much the objects are moving towards each other or away from each other at the contact point. Let's call this velocity v. If v is positive, the objects are moving away from each other and we're done. If v is negative we need to proceed onto computing and applying an impulse.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is the key computation in the solver. The ultimate question we want to answer is - how big of an impulse do we need to apply to make the objects stop moving towards each other? The direction of the impulse is going to be the contact normal (since there is no friction yet), so we're only looking for the magnitude - a scalar quantity. The velocity v that we just computed is also a scalar quantity. The impulse magnitude will be proportinonal to v - the more the objects are approaching each other, the bigger impulse we need to apply. Easy! So what we're really looking for is the correlation between those two (another scalar quantity).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let's assume we're applying an impulse of magnitude 1.0 (unit impulse) and see how big of a velocity change that would cause. The beauty of linear systems is that everything scales.. well.. linearly, so if we know how much of a velocity change a unit impulse causes we can just compare it to the desired velocity change, v, and adjust it accordingly. Say a unit impulse would cause a velocity change of 0.25 and v is -0.5, then we just apply two unit impulses (magnitude 2.0) and we'll reach our target relative normal velocity (zero). So make a copy of the linear and angular velocities for the two objects, apply an equal but opposite unit impulse and measure the relative contact velocity again following the exact same procedure as described above. Now that you know how much velocity change a unit impulse causes, the final computation boils down to a simple division. That's it folks. Apply the impulse you just computed on both objects in opposite directions and they will not move towards each other any more at the contact point.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Friction can be handled in the exact same way as described above, but in the direction of tangential relative velocity at the contact point. Friction impulses need to be capped to the magnitude of the normal impulse scaled by the friction coefficient, so that if the normal impulse is 2.0 and the friction coefficient is 0.9 your maximum friction impulse is 1.8. After you compute the friction impulse magnitude using above method, if it's more than 1.8 just apply 1.8. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now to make the solver stable you need to go over all contacts several times, and there is also a method called "accumulated impulses" that improves accuracy a lot which is not covered here, but most importantly, you need to compensate for penetration. This is usually done in a very pragmatic way - if the objects are in penetration, adjust the target relative contact velocity so that it is not zero, but slightly positive (the more penetration the more positive). This means that after we're done solving, the objects will move slightly away from each other along the contact normal instead of not moving at all.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-4795080341517727006?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/4795080341517727006/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/08/explaining-rigid-body-solver.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/4795080341517727006'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/4795080341517727006'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/08/explaining-rigid-body-solver.html' title='Explaining the rigid body solver'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-1304682785593479715</id><published>2010-07-30T14:49:00.000-07:00</published><updated>2010-07-30T15:16:44.398-07:00</updated><title type='text'>Academic frustration and running liquids</title><content type='html'>&lt;div&gt;I experimented with fluids about five years ago, at Meqon. Back then I didn't really have a clue about fluid dynamics. I guess it's fair to say I still don't have a clue about fluid dynamics, but I did read a few papers lately on SPH.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's kind of remarkable really, how complicated all those papers are written just to land at something that is quite simple. Sometimes I get the feeling that the research community are intentionally making scientific papers look more complicated to boost their egos. I wish there were research papers written for engineers, not other researchers, where you could read stuff like "And when this little fellow gets squished, it kind of pushes outwards on all the neighboring particles, proportional to their respective distance and density." &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I don't like formulas. I can read them, but it's very unintuitive for me, and probably many other programmers. I tend to believe that 3D computer graphics programmers are especially prone to finding formulas unintuitive since we are trained to visualize complex things in our minds. It's easier to visualize a description in plain english than a mathematical formula. For me at least. Maybe I just suck at math :)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A good engineering resource for physics is &lt;a href="http://www.rowlhouse.co.uk/main.html"&gt;Danny Chapman&lt;/a&gt;'s rowlhouse. There is source code to his physics project, &lt;a href="http://www.rowlhouse.co.uk/jiglib/index.html"&gt;jiglib&lt;/a&gt;, including SPH simulation and also his comments and ideas.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Back in 2005 I couldn't simulate running liquids, but only goo and clay-like fluids. I realize now that it was related to my urge to express everything through pairwise interactions. One would believe that a particle-based fluid is just a bunch of point masses interacting, but in SPH, a particle represents a sample of the implicit field, not just a point mass.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, so I implemented fluids in my solver, first as plain SPH, but I found it difficult to tweak, quite unstable and it doesn't interact well with rigid bodies, so I was searching for a semi-implicit formulation instead. It was actually quite tricky to come up with the right formulation, but eventually I think I nailed it. It looks good, it's unconditionally stable and it interacts naturally with rigid bodies without any hacks. Since it's semi-implicit it does get squishy at low iteration counts, but hey - fluids are no more incompressible than rigid bodies, right?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;object width="480" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/pWOIQCqGp0A&amp;amp;hl=en_US&amp;amp;fs=1"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/pWOIQCqGp0A&amp;amp;hl=en_US&amp;amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;object width="480" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/WQNfOiUAiZM&amp;amp;hl=en_US&amp;amp;fs=1"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/WQNfOiUAiZM&amp;amp;hl=en_US&amp;amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Oh, and a note on performance - no they are &lt;b&gt;NOT&lt;/b&gt; real-time simulations. Quite far from it actually. The second one runs at maybe one fps. Surface generation takes a good amount of time, but the number one bottleneck is actually neighbor finding. I haven't looked into it that much, since it's madness doing this on a CPU nowadays anyway, but it's kind of frustrating that something sounding so simple is the most time consuming part. Finding the neighboring particles is usually three to four times as expensive as running the solver. What's even more frustrating is that it's probably mostly due to cache misses, so laying out the whole arsenal of SIMD tools is not going to help even the tiniest bit... the whole pipeline should scale well with cores though, so could someone please get that Larrabee going again?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-1304682785593479715?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/1304682785593479715/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/07/academic-frustration-and-running.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/1304682785593479715'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/1304682785593479715'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/07/academic-frustration-and-running.html' title='Academic frustration and running liquids'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-3926206559237517548</id><published>2010-06-05T16:15:00.000-07:00</published><updated>2010-06-05T16:28:23.351-07:00</updated><title type='text'>Practicing quadrupel windsor</title><content type='html'>&lt;object width="480" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/8EMztFzyjI4&amp;hl=en_US&amp;fs=1&amp;"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/8EMztFzyjI4&amp;hl=en_US&amp;fs=1&amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;Starting to get soft body stacking and interaction in place. Especially the friction between two soft bodies is a little tricky. This is not a fake rigid body interpolation, but a true particle representation, which can compress to a single point and then return to it's original shape. It's quite expensive to be honest, since it takes more than just the regular verlet loop to get this kind of stability, but demo runs at 60 FPS.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-3926206559237517548?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/3926206559237517548/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/06/practicing-quadrupel-windsor.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/3926206559237517548'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/3926206559237517548'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/06/practicing-quadrupel-windsor.html' title='Practicing quadrupel windsor'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-8089380876293382699</id><published>2010-05-27T21:32:00.000-07:00</published><updated>2010-05-27T21:34:13.494-07:00</updated><title type='text'>Mesh collisions</title><content type='html'>&lt;div&gt;This is usually where physics gets messy. It's all fun and games until someone suggests that maybe all objects are not perfectly convex, such as... the game level? There is convex decomposition, yes, but I'm not totally convinced that's a silver bullet. Convex decomposition is awfully complicated, and requires heavy preprocessing. It's definitely a good idea in many cases, but there will always be raw triangles. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Convex decomposition or not, you need some sort of mid-phase, finding which triangles/primitives collide with a specific object. A lot of work has been put into this, and I think most people today agree that a quantized, binary AABB tree is the ideal solution.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's interesting here is how people usually query these AABB trees. The output from the broad phase is a list of pairs with overlapping bounding volumes. These pairs are then processed one by one, and in the case of a triangle mesh, the mid-phase is engaged to find the relevant triangles/primitives. After that, the near phase finds the actual contacts. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What I would like to suggest is to query the whole dynamic scene against the AABB tree all at once. That is, instead of colliding the AABB tree with a single object (single AABB) multiple times, you collide it with another AABB tree, representing all moving objects. This is especially relevant if combined with a Dynamic Bounding Volume Tree broad phase, as suggested by Erwin Coumans. In this case, all moving objects are already in a dynamic AABB tree of their own. Objects tend to appear in clusters, so objects close together also tend to collide with the same triangles. Doing the mid-phase this way saves you from drilling down the compressed AABB tree multiple times, which gives an instant performance gain. The tree/tree traversal is a bit of a mind job at first, but the Bullet DBVT implementation is a really good reference. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As usual, I'm really lazy and not doing my side-by-side comparisons, but it might show up later.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-8089380876293382699?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/8089380876293382699/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/05/mesh-collisions.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8089380876293382699'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8089380876293382699'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/05/mesh-collisions.html' title='Mesh collisions'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-8542842882690037223</id><published>2010-05-10T06:40:00.000-07:00</published><updated>2010-05-10T06:44:29.069-07:00</updated><title type='text'>Movie time</title><content type='html'>It was a long time since I posted a movie, so I thought I'd record a little snippet of what I'm working on at the moment - mesh collisions. Without the fancy SSAO renderer it runs in 60 FPS. Stay tunes for details!&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;object width="480" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/EGogPcDGTi0&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/EGogPcDGTi0&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-8542842882690037223?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/8542842882690037223/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/05/movie-time.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8542842882690037223'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8542842882690037223'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/05/movie-time.html' title='Movie time'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-7142707794591945365</id><published>2010-05-04T20:14:00.000-07:00</published><updated>2010-05-04T20:17:41.606-07:00</updated><title type='text'>STL</title><content type='html'>&lt;div&gt;I really like the idea of using standardized versions of simple data structures. Small snippets of code rewritten over and over again by thousands of people. I bet there are a handfull of programmers world-wide at this moment implementing a dynamic array or a hash table. As much as I like the idea, I hate the only standard out there - STL.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There are several problems:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1) The interface is horribly over-engineered and just &lt;b&gt;plain sucks&lt;/b&gt;. Even the simplest of simple operations turns out as three or for lines of code. &lt;i&gt;Really long ones&lt;/i&gt;. Say I want to remove an element from a vector. Instead of just offering a dead simple myVector.erase(element) I have to bother with iterators and find functions (of course defined in another include file), ending up with this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;std::vector&lt;type&gt;::iterator it = find(myVector.begin(), myVector.end(), element);&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;if (it != vec.end())&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt; &lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;vec.erase(it);&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This does only remove the first occurance of "element", which is understandable. If you want to remove all of them you need to use a whole different method... Another option is to iterate through the vector myself and remove the desired elements, basically implementing the erase algorithm on the spot, which completely takes away the benefit of using a standardized vector in the first place. Just as a bonus, you need to take special care of what happens to the iterator as you erase the element it's pointing at. Sigh.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;2) It is not minimal, nor consequent. There are heaps of ways to do everything, and none of them seems more commonly used than any other, and the naming of operations vary between classes. Removing elements is a good example, which can be done with member functions pop_back, erase or remove (available on lists but not vectors). There are also the global algorithms remove and remove_if, which actually does not really remove anything but merely scrambles elements around a bit for easier removal later. Confusing aye?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;3) Because of its severe complexity it is not consistent across platforms. Hence, porting your code to another platform involves small tweaks. There are projects like stlport that makes life easier, but it kind of takes away the purpose of a standard if there is actually only one implementation that is portable.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;4) Debugging STL code makes you suicidal. You know exactly what I mean.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;5) It claims to be designed for performance, but only considers academic-style algorithmic complexity and not real-world performance which is often based more on memory allocations, locking and memory access patterns. Some implementations use pool allocators to speed things up, but it's still allocations. I personally use my own data structures which includes an initial template-specified storage as part of the container and only allocate memory if it grows beyond that initial size. I suppose STL could be configured to do similar things by a template wizard, but I wouldn't want to be the guy maintaining that mess.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-7142707794591945365?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/7142707794591945365/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/05/stl.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/7142707794591945365'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/7142707794591945365'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/05/stl.html' title='STL'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-8319580614482522319</id><published>2010-04-14T03:44:00.000-07:00</published><updated>2010-04-14T04:21:20.531-07:00</updated><title type='text'>Visualizing complex algorithms</title><content type='html'>&lt;div&gt;I have had this idea for a long time about some tool to follow the steps of a complicated algorithm in some way other than stepping through it in the debugger, trying to mentally visualize what's going and frenetically scribbling unrecognizable geometrical figures on envelopes and old receipts.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A simple visualization sandbox is useful for visualizing the result, but it's really hard to visualize intermediate steps. GJK is a good example of a complex, iterative algorithm that has a visual representation at every step. A typical demo application would visualize the resulting closest points, normal, etc, &lt;i&gt;but not the internal steps of the algorithm itself&lt;/i&gt;, so if something goes wrong, you're screwed. The typical way to deal with it is to write parts of the algorithm in an isolated visualization environment and then lift them over to the real implementation when they are "tested". However, for visualization code to be useful, it needs to stay in the actual implementation so when real-world problems start coming in, they can be tracked down quickly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To do this properly, you would need to step the algorithm from the outside - that is, make an extra API to control it and poll the relevant information from the outside. Such an API would likely be substantially larger than the original API itself, and it's definitely not something you would want to ship.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another approach would be to add visualization code to the algorithm implementation, but this does not tie in very nicely with modern rendering engines. Say you want to use OpenGL for rendering. It would be convenient to draw lines, points, text, etc and then wait for some input event before going any further. However, the way OGL (or D3D) works, you fill a back buffer and won't see it until the entire frame is drawn. There might be a way to go around this, but it would be hard to integrate into any other application. I was for a moment considering running the rendering loop in a separate thread and feed the visualization thread with commands from the algorithm thread and also include a pause command, that would block until the user wants to step further.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The solution I finally settled on was to use an event stream that is filled in by the algorithm implementation using macros and a visualization framework for rendering the stream. Hence, the algorithm runs all the way from beginning to end, recording visualization events along the way. There is a pause event which indicates that the visualization framework should wait for user input. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I also needed something for removing graphics that was previously added. Say you have the normal of a plane, and the plane is changing every iteration. One way to do that would be to start referencing all the events and add explicit remove events, but that would quickly get quite messy. Instead, I added push and pop commands, so a subset of the graphics can be visualized and then removed before going any further. My visualization app can walk both forwards and backwards in the recorded event stream, which makes it easy to follow what's going on. It turned out to be quite useful and it doesn't bloat the code too much, typically something like this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;DEBUG_PRINT("Computing stuff");&lt;/div&gt;&lt;div&gt;for(int i=0; i&amp;lt;10; i++)&lt;/div&gt;&lt;div&gt;{&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;...&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;computePoint(p);&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;computeNormal(n);&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;DEBUG_PUSH();&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;DEBUG_PRINT("Point and normal at iteration " + i);&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;DEBUG_POINT(p);&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;DEBUG_LINE(p, p+n);&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;DEBUG_PAUSE();&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;DEBUG_POP();&lt;/div&gt;&lt;div&gt;}&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This has proven to be really useful for debugging GJK, EPA, MPR, etc. I use automatic unit testing on distance queries, save the state for failing configurations and load them up in the debugger to manually inspect what went wrong. It gives a pretty clear view on various precision problems, termination critera, etc that would be very hard to get an intuitive understanding of otherwise. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_bRDPJhc-wtE/S8WfWsFO9kI/AAAAAAAAAqM/GCfagcxfN24/s1600/Screen+shot+2010-04-14+at+10.43.44+PM.png"&gt;&lt;img src="http://4.bp.blogspot.com/_bRDPJhc-wtE/S8WfWsFO9kI/AAAAAAAAAqM/GCfagcxfN24/s320/Screen+shot+2010-04-14+at+10.43.44+PM.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5459945335395644994" style="cursor: pointer; width: 320px; height: 221px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;My visualization app in the middle of a penetration depth calculation.&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The downside of my approach is that you can't combine it with stepping through a regular debugger. It would be quite nifty to see the graphical representation updating while stepping through the code in the debugger. I guess I could pass the event stream over a network socket to a different process, but that sounds just a tad too ambitious for my needs at the moment :)&lt;/div&gt;&lt;div&gt; &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-8319580614482522319?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/8319580614482522319/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/04/visualizing-complex-algorithms.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8319580614482522319'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8319580614482522319'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/04/visualizing-complex-algorithms.html' title='Visualizing complex algorithms'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_bRDPJhc-wtE/S8WfWsFO9kI/AAAAAAAAAqM/GCfagcxfN24/s72-c/Screen+shot+2010-04-14+at+10.43.44+PM.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-4816587705288024605</id><published>2010-03-30T00:09:00.000-07:00</published><updated>2010-03-30T04:28:06.989-07:00</updated><title type='text'>Thoughts on shape casting</title><content type='html'>&lt;div&gt;I implemented a different type of shape casting yesterday. The most common way, to my knowledge at least, is to do it with conservative advancement:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1) Run the GJK algorithm to find closest point to the origin in configuration space&lt;/div&gt;&lt;div&gt;2) Use the distance and angle to compute a safe estimate to move (cast) the configuration space object towards the origin, in the cast direction.&lt;/div&gt;&lt;div&gt;3) Repeat until below a certain threshold, or the direction to the origin is pointing away from the cast direction (in which case it is a miss).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Even though this is an iterative method, relying of GJK, which in itself is iterative, this is actually pretty fast, because it converges quickly, like two or three iterations and you can benefit from warmstarting GJK with the simplex from the previous iteration.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;However, there is a completely different method, which I thought I came up with, but after googling it for a while it seems Jay Stelly has mentioned similar methods before across the &lt;a href="http://bulletphysics.org/Bullet/phpBB3/viewtopic.php?f=9&amp;amp;t=1244&amp;amp;view=previous"&gt;Bullet&lt;/a&gt; and &lt;a href="https://mollyrocket.com/forums/viewtopic.php?p=3628"&gt;MollyRocket&lt;/a&gt; forums. The method uses MPR on a swept CSO. The idea is to sweep the CSO in the direction of the cast far enough to enclose the origin, and then raycast from the origin to find the distance to the surface.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1) Construct the swept CSO. This may sound complicated, but since support functions are additive it's as easy as adding a vector to the existing support function. The resulting shape will be a swept version of the original CSO. The direction of the vector should be the sweep direction, and the length should be "long enough" to pass the origin. As showed later on, the distance here doesn't really matter much, so just keep it reasonable to avoid numerical issues. I use the distance from the geometric center of the CSO to the origin. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;2) Use &lt;a href="http://en.wikipedia.org/wiki/Minkowski_Portal_Refinement"&gt;MPR&lt;/a&gt;, or &lt;a href="http://xenocollide.snethen.com/"&gt;XenoCollide&lt;/a&gt;, to find the distance from the origin to the surface of the swept CSO, in the sweep direction. If the origin is not inside the CSO, we know there is a miss.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;3) Subtract the distance found in step 2) from the distance computed in step 1).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;4) That's it! The difference in step 3 is the final sweep distance.  Imagine a line going from the original CSO through the origin to the point computed using MPR in step 2). The line will have the same direction as the sweep and the origin will divide it into two parts, the first part, from the original CSO to the origin, and the second part from the origin to the surface of the swept CSO. You know the total length of the line (determined in step 1) and you know the length of the second part of the line (computed in step 3), so therefore the first part must be the difference. I know, I know, a picture would be ideal, but I'm lazy. Bug me about it and I might post one :)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Although I haven't done any side-by-side comparisons yet, it seems to me much more efficient to do this second approach, since it is much simpler, less prone to numerical precision issues and (yet to be proved) faster. So if this is a known method, how come everybody is still talking about conservative advancement?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[EDIT]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I quickly profiled the new method against the old one (conservative advancement). Each session is 10 000 sweep queries with different angles, computing closest points, sweep distance and contact normal.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Box - Box (old)&lt;/div&gt;&lt;div&gt;Min 10 ms&lt;/div&gt;&lt;div&gt;Avg 16 ms&lt;/div&gt;&lt;div&gt;Max 30 ms&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Box - Box (new)&lt;/div&gt;&lt;div&gt;Min 11 ms&lt;/div&gt;&lt;div&gt;Avg 14 ms&lt;/div&gt;&lt;div&gt;Max 16 ms&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;Cylinder - Cylinder (old)&lt;/div&gt;&lt;div&gt;Min 14 ms&lt;/div&gt;&lt;div&gt;Avg 40 ms&lt;/div&gt;&lt;div&gt;Max 99 ms&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Cylinder - Cylinder (new)&lt;/div&gt;&lt;div&gt;Min 20 ms&lt;/div&gt;&lt;div&gt;Avg 35 ms&lt;/div&gt;&lt;div&gt;Max 53 ms&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Polyhedron - Polyhedron 128 verts (old)&lt;/div&gt;&lt;div&gt;Min 99 ms&lt;/div&gt;&lt;div&gt;Avg 130 ms&lt;/div&gt;&lt;div&gt;Max 252 ms&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Polyhedron - Polyhedron 128 verts (new)&lt;/div&gt;&lt;div&gt;Min 98 ms&lt;/div&gt;&lt;div&gt;Avg 130 ms&lt;/div&gt;&lt;div&gt;Max 185 ms&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It turns out that conservative advancement is slightly faster in best case, slightly slower on average, but much slower in worst case, which makes sense, since it is doubly iterative. I have not noticed any differences in robustness so far, but I haven't stressed it either.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-4816587705288024605?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/4816587705288024605/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/03/thoughts-on-shape-casting.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/4816587705288024605'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/4816587705288024605'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/03/thoughts-on-shape-casting.html' title='Thoughts on shape casting'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-4824512284472605355</id><published>2010-03-21T13:51:00.000-07:00</published><updated>2010-03-21T13:54:47.563-07:00</updated><title type='text'>Greetings from the inside!</title><content type='html'>&lt;div&gt;So after spending a few weeks outside the CSO, I got around to step inside and experiment with a number of different penetration distance algorithms this week. However, I failed to come up with an alteriative to EPA that can actually guarantee a global minimum separation. These are the ones I tried:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1) Use MPR to raycast the inside of the CSO in a direction estimated by the geometric center of the CSO, use the resulting normal and iterate until convergence. This works in 95% of all cases and is really fast. The downside is that there is no way to guarantee or measure if this really is the global minimum.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;2) Same method as in 1) but use multiple search directions. I was for a while under the impression that two local minima had to be at least 90 degrees apart, so casting six rays should be safe (of which three can be optimized away). This works in 99% of all cases, but there is still that annoying one percent where it finds a local minima.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;3) Same method as in 1) but trace the edges of the last portal. After the initial surface point has been found, tilt the search direction in the direction of each edge, so that it falls outside the last portal, choose the best direction and iterate until convergence. This way we can sample the neighboring triangles and search those directions. Again, there is a 99% success rate. Unfortunately in some cases all neighboring triangles will lean towards the initial triangle, so we will just end up at the same local minima again.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;4) Broken EPA. I tried using EPA without tracking connectivity and without keeping the polytope convex. Simply a triangle soup, where the closest triangle is split and valid triangles (the ones that are portals) are kept. It doesn't work.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;5) Full EPA. I used the one found in Bullet as inspiration and it works really well in all cases. I'm really impressed by the Bullet implementation by Nathanael Presson. EPA is an awfully complicated algorithm to implement, but it seems very stable and quite fast.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So all this fuzz to come up with the conclusion that EPA is the way to go? I'm still contemplating wether there is a way to combine MPR and EPA to make an early out and just find the surface using MPR. It might not be possible.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm still not using EPA for all penetrations. In most cases using MPR with the last known separation axis works excellent, even just sampling and projecting support mappings on the last axis works really well. Physics is surprisingly forgiving about something, for once.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-4824512284472605355?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/4824512284472605355/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/03/greetings-from-inside.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/4824512284472605355'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/4824512284472605355'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/03/greetings-from-inside.html' title='Greetings from the inside!'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-8829639391886235773</id><published>2010-03-16T13:29:00.001-07:00</published><updated>2010-03-16T13:52:11.611-07:00</updated><title type='text'>Trapped in configuration space</title><content type='html'>I've spent the last couple of weeks tinkering with GJK and ended up rewriting it from scratch. Ever since I saw &lt;a href="http://mollyrocket.com/136"&gt;Casey&lt;/a&gt;'s &lt;a href="http://mollyrocket.com/849"&gt;video&lt;/a&gt; about SAT-GJK a couple of years ago I've wanted to try it, so I finally did. Now Casey's video only deals with binary overlap status, not closest points, but it was relatively easily to extend his approach and even do a linear sweep operation. I especially like how his method does not include any divisions or square roots, which makes it very robust. I must add though that it's not straightforward to get a robust implementation that works on both polyhedra and quadric objects. Especially the termination criteria needed a lot of attention.&lt;br /&gt;&lt;br /&gt;I knew from the beginning that decent visualization tools would be required to pull off a new GJK implementation, and it's certainly not trivial to visualize all the information needed. There are the two objects, the configuration space object, the simplex which comes in four flavors, normals, closest points, etc, etc. Here's a screenshot from my test bed in the middle of a GJK distance calculation. The algorithm can be stepped and visualized at both the add and the reduce step of each iteration:&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_bRDPJhc-wtE/S5_rYGh4qqI/AAAAAAAAAoc/gBw8QbrjK84/s1600-h/Screen+shot+2010-03-16+at+2.06.16+PM.png"&gt;&lt;img src="http://2.bp.blogspot.com/_bRDPJhc-wtE/S5_rYGh4qqI/AAAAAAAAAoc/gBw8QbrjK84/s320/Screen+shot+2010-03-16+at+2.06.16+PM.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5449332873443715746" style="cursor: pointer; width: 320px; height: 238px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;On the left is the configuration space object, the red triangle is current simplex, with closest point and normal. The corresponding object space closest points are also visualized on the two objects. I could have spent even more time on visualization, and I might just do that, because next I want to try out a couple of ideas for penetration depth computation.&lt;br /&gt;&lt;br /&gt;The new implementation is both more robust and higher performance that the old one from Meqon one, which uses Johnson's distance algorithm from the original paper. I'm not sure how it compares to the Bullet implementation with geometric distance algorithm. Part of the performance increase is due to a more efficient caching scheme. To old version only supported an initial 1-simplex, which was the previous frame's closest points, while the new uses the whole previous simplex, which is a bit more complicated, since it can deform badly due to motion, but it really pays off, and most of the times it will terminate immediately after the first iteration.&lt;br /&gt;&lt;br /&gt;Here are some performance comparisons:&lt;br /&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;Scene with 16-vert polyhedra&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;Old: 22.5 ms&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;New: 17.1 ms&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt;&lt;span class="Apple-style-span" style="white-space: normal; "&gt;New+caching: 9.3 ms&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre;"&gt;Scene with cylinders&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;Old: 2.6 ms&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt;New: &lt;/span&gt;2.5 ms&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;New+caching: 1.7 ms&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;Scene with boxes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;Old: 10.7 ms&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;New: 8.8 ms&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;New+caching: 6.3 ms&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;Scene with mixed objects&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;Old: 19.3 ms&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;New: 13.9 ms&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;New+caching: 6.9 ms&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;So, depending on the scenario, and how much motion there is, the new version ranges from slightly faster to almost three times faster. More importantly though, is that the new version is more numerically robust.&lt;br /&gt;&lt;br /&gt;I find it quite fascinating how Casey's approach with caching is very similar to the old Lin-Canny closest feature algorithm. Both methods are based on voronoi regions, but Casey's operates in configuration space, while Lin-Canny uses object space. Casey's approach uses implicit voronoi regions determined by support functions, while Lin-Canny uses explicit pre-computed voronoi regions, but the underlying principle, that the closest distance is always found within the voronoi regions of a particular feature pair is the same.&lt;br /&gt;&lt;br /&gt;What currently keeps me awake at night is if there might be an implicit configuration space counterpart to V-Clip, which would then work for penetrating objects as well? The only implicit penetration depth method I'm aware of that has been proven to work is EPA, but it's so awkward I'm not even going to try implementing it. Intuitively I feel like there &lt;i&gt;must&lt;/i&gt; be a simpler way to compute exact penetration depth that doesn't involve dynamic data structures.&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-8829639391886235773?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/8829639391886235773/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/03/ive-spent-last-couple-of-weeks.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8829639391886235773'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8829639391886235773'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/03/ive-spent-last-couple-of-weeks.html' title='Trapped in configuration space'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_bRDPJhc-wtE/S5_rYGh4qqI/AAAAAAAAAoc/gBw8QbrjK84/s72-c/Screen+shot+2010-03-16+at+2.06.16+PM.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-292800566759904454</id><published>2010-02-09T22:41:00.000-08:00</published><updated>2010-02-09T23:34:10.049-08:00</updated><title type='text'>Oh how hard can it be</title><content type='html'>Funny, I'm falling into my own how-hard-can-it-be convex hull trap I described just three posts ago! It turns out my idea for a support mapping-based convex hull generator isn't that great after all. It works perfectly in 2D, but in 3D it turns out the surface can get concave during the process, which of course will mess up the whole thing, so just forget everything I ever said about convex hull now would you (except perhaps the part about it always being harder than you think)? Thanks. Back to the drawing board. There must be some way to compute convex hulls using only the support direction..&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-292800566759904454?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/292800566759904454/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2010/02/oh-how-hard-can-it-be.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/292800566759904454'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/292800566759904454'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2010/02/oh-how-hard-can-it-be.html' title='Oh how hard can it be'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-5266140549008380814</id><published>2009-12-15T11:21:00.000-08:00</published><updated>2009-12-16T14:07:37.891-08:00</updated><title type='text'>More hull</title><content type='html'>&lt;div&gt;I realized I never wrote anything about the generic convex object visualization I mentioned earlier. It actually turned out really good, and the algorithm is very simple:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1) Get the support point for each of the positive and negative coordinate axes.&lt;/div&gt;&lt;div&gt;2) Build an octahedron with corners at the six points (like a pyramid pointing up, on top of a pyramid pointing down. This is the starting shape.&lt;/div&gt;&lt;div&gt;3) For each triangle in the current shape:&lt;/div&gt;&lt;div&gt;Get support point in outwards normal direction.&lt;/div&gt;&lt;div&gt;If support point does not equal (within margin) any of the three corners,&lt;/div&gt;&lt;div&gt;split the triangle into three new ones, using the new point.&lt;/div&gt;&lt;div&gt;4) Repeat step three until convergence, or maximum number of iterations.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Splitting the triangle into three is not ideal since it encourages long thin triangles and a rather uneven triangle distribution, but in practice it works fairly well. I guess one could look into Loop-style edge splitting instead, but it would require more bookkeeping.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I add a small sphere to each support function to get the slick round edges virtually for free!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_bRDPJhc-wtE/SyfkqYE2a0I/AAAAAAAAAlk/sKDSVZtOrRU/s1600-h/1.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_bRDPJhc-wtE/SyfkqYE2a0I/AAAAAAAAAlk/sKDSVZtOrRU/s320/1.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5415548493604744002" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I think I'm gonna have to take back what I said about the performance of stanhull. It's actually really slow and can somtimes take several milliseconds for just a few hundred points. I'm going to try and write my own convex hull generator based on the visualization algorithm above, but with one extra step - after each triangle split, prune vertices that are inside the newly formed tetrahedron. It would not be able to generate an optimal hull without reduction, but it would be guaranteed to have valid topology and connectivity. It would also have an almost trivial implementation and perform really well.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-5266140549008380814?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/5266140549008380814/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2009/12/more-hull.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/5266140549008380814'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/5266140549008380814'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2009/12/more-hull.html' title='More hull'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_bRDPJhc-wtE/SyfkqYE2a0I/AAAAAAAAAlk/sKDSVZtOrRU/s72-c/1.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-8142190824256402190</id><published>2009-12-09T18:15:00.000-08:00</published><updated>2009-12-09T19:10:16.370-08:00</updated><title type='text'>Big or beautiful</title><content type='html'>&lt;div&gt;Here is a capture from a couple of simple demos. I've made a number of improvements to the solver lately and also now gathering multiple contact points in the initial manifold. Incremental manifold is such a neat idea, but it does affect stacking stability, and objects tend to penetrate more, which in turn affects both performance and behavior. I'm now using relative rotations and reiterating GJK to find multiple points, but I'm experimenting with other methods that should be both cheaper and more stable.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;All demos use a 60 Hz update and four solver iterations. Various shock propagation methods in the solver remove a lot of the sponginess you normally see at four iterations. I strongly believe in elegant algorithms, rather than hand-optimized brute force code. Hence, I think it's better to have a rather sophisticated solver that do more work per iteration rather than just increasing the number of iterations. As you can see in the demos, even a quite large stack is pretty stiff.&lt;/div&gt;&lt;div&gt; &lt;/div&gt;&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/UwsyxMQJZKs&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/UwsyxMQJZKs&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Speaking about demos, I am SO tired of physics demos running in slow-motion. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1) Most programmers know that taking smaller time steps increases general stability. Objects move less per time step, hence making things easier to compute.&lt;/div&gt;&lt;div&gt;2) Decreasing gravity improves stability, since stacks feel less weight of the objects above them.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Most people would agree that altering any of the above is cheating. People are generally very picky about keeping the time step realistic, and some are proud to use 9.82 as gravity instead of just 10. But hey, why do most physics demos run in slow-motion then? Because there's a third way to cheat that nobody cares about:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;3) Making objects larger is just as bad as decreasing gravity.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Remember what Galileo taught us four hundred years ago: Large objects receive the &lt;i&gt;same&lt;/i&gt; gravitational acceleration as small objects. So, keeping the acceleration constant and increasing the size is equivalent to keeping the same size of the object and decreasing the gravity! Increasing the size of an object might seem quite innocent "Ho ho ho, my simulation is so stable. This box just happen to by six feet tall, who can blame &lt;i&gt;me&lt;/i&gt; for that?"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;How about simulating a pile of books? Or a couple of plates and wine glasses? At 60 Hz that's quite a challenge, and which objects are going to be more common in your game? So, if you evaluate a physics engine always make sure to do it in a realistic scale. The boxes in these demos are just above one foot along the side, which I think even that is questionable. A set of physics demos that really impressed me for their natural looking motion is the Tokamak Physics Engine.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-8142190824256402190?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/8142190824256402190/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2009/12/big-or-beautiful.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8142190824256402190'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/8142190824256402190'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2009/12/big-or-beautiful.html' title='Big or beautiful'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-4546346326459201640</id><published>2009-12-01T22:30:00.000-08:00</published><updated>2009-12-01T23:27:03.449-08:00</updated><title type='text'>Full hull</title><content type='html'>Today I was scouting around for a decent convex hull generator. It's quite surprising to me that such a clearly defined task still doesn't have a de facto standard implementation that everybody uses. I just wanted one for computing the visualization meshes, so performance is not even critical. &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I looked into a couple of different ones that had clean interfaces. &lt;a href="http://student.ulb.ac.be/~claugero/3dch/index.html"&gt;This one&lt;/a&gt; looked promising due to its simplicity, but after trying just a few point clouds it crashed on me. Convex hull generation is tricky that way &lt;i&gt;"Oh, how hard can it be, just split that triangle, merge those two, test for that plane, blah blah"&lt;/i&gt; and then when you sit down and implement it there are all these degenerate cases, and you can't really make assumptions about &lt;i&gt;anything&lt;/i&gt;. Our convex hull generator at Meqon was a constant pain in the ass for three years. I don't think we ever got it right. Qhull seems to be very robust, but a) it's a huge mess, b) it has a ton of features you don't want and c) it has a somewhat questionable license.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;James Dolan kindly pointed me to an implementation written by my former colleague Stan Melax for a physics exporter. John Ratcliff then wrapped it up, gave it the name "stanhull" and posted it on his &lt;a href="http://codesuppository.blogspot.com/2006/03/john-ratcliffs-code-suppository-blog.html"&gt;code suppository&lt;/a&gt;. It took me five minutes to integrate, it works like a charm, it has a great interface, no dependencies and it's &lt;i&gt;fast. &lt;/i&gt;Look no more and cheers to Stan.&lt;/div&gt;&lt;div&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;/div&gt;&lt;div&gt;I have also contemplated to do an automatic visualization of implicit convex hulls using only the support function. Not that it would be super-useful, but still pretty cool to use the same code for visualizing all types of convex objects. A naive approach would be to sample the support function in a ton of different directions and just build a hull of the resulting points, but I've been thinking about a more intelligent subdivision method to avoid sampling crazy many directions and still not miss corners. Might try it tomorrow.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-4546346326459201640?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/4546346326459201640/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2009/12/full-hull.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/4546346326459201640'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/4546346326459201640'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2009/12/full-hull.html' title='Full hull'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-4067247859009779656</id><published>2009-11-29T17:13:00.000-08:00</published><updated>2009-11-29T19:44:44.590-08:00</updated><title type='text'>Putting on the shades</title><content type='html'>For rendering, I wanted to go one step (but not more than one step) beyond the flat-shaded polygons that are typical for physics demos. I wanted shadows, but I have never been a big fan of sharp edges, such as with shadow volumes or even shadow maps. Soft shadows is really what enables a whole new level of realism. &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I worked on an ambient occlusion project for Ageia a couple of years ago, just before they got acquired by Nvidia and I think that's a really good alternative to real global illumination. My approach at the time was to compute ambient oclcusion in 3D on the second generation PhysX hardware (that was never released) and dynamically update low-res light maps for the parts of the scene that changed. It had &lt;a href="http://www.youtube.com/watch?v=Qw7nT8xmGUE"&gt;good potential&lt;/a&gt;, but shortly after Crysis was released and the screen-space methods (SSAO) started taking off. I've always been curious about SSAO ever since but never got around to implement one, so I thought this was a good opportunity.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I found &lt;a href="http://www.shalinor.com/research.html"&gt;this&lt;/a&gt; article, which is a really good introduction. I didn't end up using exactly that method, but it's quite similar. Vertex position and normals are rendered to an FBO, then the occlusion is computed to another FBO and the final image is rendering to the framebuffer using deferred lighting. During the final pass occlusion values are blurred, using only samples in the same plane. Ideally one would want to do this with a gauss kernel, but I only do it horizontally and vertically to save some shader cycles. I'm still on my three year old MacBook Pro with an ATI X1600... Additionally I'm also blurring the normals a tiny bit during the final pass to soften the edges.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There's not an awful lot of time spent on the rendering part, but I'm quite happy with it for the time being. Here's a video demonstrating the scene with and without ambient occlusion.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/a4lQafsSWvA&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/a4lQafsSWvA&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-4067247859009779656?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/4067247859009779656/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2009/11/putting-on-shades.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/4067247859009779656'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/4067247859009779656'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2009/11/putting-on-shades.html' title='Putting on the shades'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-779328462175891131.post-5719741420717896200</id><published>2009-11-29T15:13:00.000-08:00</published><updated>2009-11-29T15:25:04.371-08:00</updated><title type='text'>A new hope</title><content type='html'>So I finally did what I should have done a long time ago - I started coding physics again. All those ideas that were lingering in the back of my head in the Meqon and Ageia days but I never had time to try will finally materialize, be implemented and posted on this very blog, along with pictures, videos, demos and general ideas around physics engines.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So far I've focused on rigid body dynamics, and more specifically the contact manifold and solver. I'm a big fan of &lt;a href="http://en.wikipedia.org/wiki/Gilbert%E2%80%93Johnson%E2%80%93Keerthi_distance_algorithm"&gt;GJK&lt;/a&gt;, so there's no surprise it makes up the backbone of the collision detection engine.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The solver is of sequential impulse type with warmstarting (just like everybody else), but has a big bag of tricks that accumulated over the years. I put a lot of effort in friction this time, since I really wanted a realistic friction that does not creep, but actually sticks and comes to rest, even in non-trivial poses.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Allright, here's the obligatory box pyramid:&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_bRDPJhc-wtE/SxMChVvj_ZI/AAAAAAAAAlc/_mhrucPIy-8/s1600/Picture+1.png"&gt;&lt;img src="http://2.bp.blogspot.com/_bRDPJhc-wtE/SxMChVvj_ZI/AAAAAAAAAlc/_mhrucPIy-8/s320/Picture+1.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5409670349197671826" style="cursor: pointer; width: 320px; height: 239px; " /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt; &lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/779328462175891131-5719741420717896200?l=tuxedolabs.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tuxedolabs.blogspot.com/feeds/5719741420717896200/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://tuxedolabs.blogspot.com/2009/11/new-hope.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/5719741420717896200'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/779328462175891131/posts/default/5719741420717896200'/><link rel='alternate' type='text/html' href='http://tuxedolabs.blogspot.com/2009/11/new-hope.html' title='A new hope'/><author><name>Dennis Gustafsson</name><uri>http://www.blogger.com/profile/07339235520248351574</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='29' src='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxL2rlUXJsI/AAAAAAAAAk4/dVl33GvELyY/S220/IMG_1466.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_bRDPJhc-wtE/SxMChVvj_ZI/AAAAAAAAAlc/_mhrucPIy-8/s72-c/Picture+1.png' height='72' width='72'/><thr:total>0</thr:total></entry></feed>
