View Issue Details

ID: 0003234
Project: The Dark Mod
Category: Coding
View Status: public
Last Update: 25.06.2018 05:52
Reporter: nbohr1more
Assigned To: duzenko
Priority: normal
Severity: feature
Reproducibility: have not tried
Status: resolved
Resolution: fixed
Product Version: SVN
Target Version: TDM 2.06
Fixed in Version: TDM 2.06
Summary: 0003234: Merge Mh's Optimized VBO Code

Description:
Quake coder Mh has taken an initial look at modernizing Doom 3's Vertex Cache and VBO Code:

"What the glMapBufferRange stuff does is allow you to take advantage of a VBO streaming pattern that D3D has enjoyed since at least version 7 - in D3D terms it's known as the discard/no-overwrite pattern.

A VBO is a GPU resource, and normally, if you try to update a GPU resource that is currently in use for drawing (entirely possible because of the asynchronous nature of CPU/GPU operation), everything must stall and wait for drawing to complete before the update can happen. The stock Doom 3 code actually double-buffers its streaming VBOs to try to avoid this (in a slightly obfuscated way), but glMapBufferRange is a more robust way.

So, I mentioned discard/no-overwrite above. Here's what they do.

The buffer is filled in a linear manner. You've got 2 MB (or whatever) of space; vertexes are added beginning at position 0, and as new vertexes are added they get appended until the buffer fills. Then magic happens.

This standard update is no-overwrite; your code makes a promise to GL that it's not going to overwrite any region of the buffer that may be currently in use for drawing, and in return GL will let you update the buffer without blocking. In order to be able to keep this promise your code must maintain a counter indicating how much space in the buffer it has previously used, and add new verts to the buffer at this counter position.

When the buffer becomes full you "discard". This doesn't throw away anything previously added; instead, GL will keep the previous block of buffer memory around for as long as is needed to satisfy any pending draw calls, but will give you a new, fresh block for any further updates. That's the "magic" I mentioned above, and it's what lets you use a streaming VBO without any blocking.

This pattern will also let you get rid of Doom 3's double buffering, thus saving you some GPU memory (I haven't yet done this in my code). Because there's no more blocking it will run faster in cases where there is a lot of dynamic buffer usage, but because Doom 3 locks at 60fps it may not be as directly measurable as if the engine was unlocked. Hence the "it feels more responsive but I can't quite put my finger on it" result.

There's another chunk of code in the standard Alloc call which deals with updates of non-streaming VBOs and which is implemented in quite an evil manner by the stock Doom 3 code. When updating such a VBO you can get a faster update if the glBufferData params are the same as were previously used for that VBO (the driver can just reuse the previous block of buffer memory instead of needing to fully reallocate). Doom 3 doesn't do that, so it doesn't get these faster updates, but it can by searching the free static headers list for a VBO that matches and using that, instead of just taking the first one from the list. Obviously it sucks that you need to search the list in this way, and a better implementation would just store the VBO with the object that uses it, and reuse the same VBO each time. Since this mainly happens with model animations, an even better implementation would use transform feedback to animate the model instead of animating it on the CPU and needing to re-upload verts each frame, but I haven't even looked at that yet.

So all in all the stock VBO implementation is an unholy mess that needs serious work to get it functioning right, much the same way as Quake 1 lightmap updates were a mess. That code just represents the start of a process.."
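
Editor's sketch of the discard/no-overwrite bookkeeping described in the quote above. This is not Mh's code and not the attached VertexCache.cpp; `StreamCursor`, `StreamRange` and the `k...` constants are hypothetical names made up for illustration. The constants repeat the numeric values of the real GL_MAP_WRITE_BIT, GL_MAP_INVALIDATE_BUFFER_BIT and GL_MAP_UNSYNCHRONIZED_BIT enums so the snippet compiles without GL headers. It only decides where the next block of vertexes goes and which access flags would be requested; it assumes a single streaming buffer and that each request fits in it.

```cpp
#include <cassert>
#include <cstddef>

// Numeric values of the real OpenGL enums, copied here only so the
// sketch builds without GL headers.
constexpr unsigned kMapWriteBit            = 0x0002; // GL_MAP_WRITE_BIT
constexpr unsigned kMapInvalidateBufferBit = 0x0008; // GL_MAP_INVALIDATE_BUFFER_BIT ("discard")
constexpr unsigned kMapUnsynchronizedBit   = 0x0020; // GL_MAP_UNSYNCHRONIZED_BIT ("no-overwrite")

// Hypothetical result type: the arguments one would hand to glMapBufferRange.
struct StreamRange {
    std::size_t offset;  // byte offset into the buffer
    std::size_t length;  // number of bytes to map
    unsigned    access;  // access flags
};

// Hypothetical helper: the "counter indicating how much space in the
// buffer has previously been used".
class StreamCursor {
public:
    explicit StreamCursor(std::size_t capacity) : capacity_(capacity), used_(0) {}

    StreamRange reserve(std::size_t bytes) {
        assert(bytes > 0 && bytes <= capacity_);
        unsigned access = kMapWriteBit;
        if (used_ + bytes > capacity_) {
            // Buffer is full: discard. The driver keeps the old block alive
            // for pending draws and hands back a fresh one, so restart at 0.
            used_ = 0;
            access |= kMapInvalidateBufferBit;
        } else {
            // No-overwrite: append after everything written so far, which
            // keeps the promise not to touch data that may be in use.
            access |= kMapUnsynchronizedBit;
        }
        StreamRange range{used_, bytes, access};
        used_ += bytes;
        return range;
    }

    std::size_t used() const { return used_; }

private:
    std::size_t capacity_;
    std::size_t used_;
};
```

In a real renderer the returned values would be passed as the offset, length and access arguments of glMapBufferRange, the vertexes copied into the returned pointer, and the buffer released with glUnmapBuffer before drawing from that offset.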
Steps To Reproduce:
1) Replace VertexCache.cpp and VertexCache.h with the attached

Dropbox download

2) Compile
3) Test the results
Tags: No tags attached.


related to 0004849 (resolved, cabalistic): Replace VBO implementation with the one similar to D3BFG
child of 0003684 (new): Investigate GPL Renderer Improvements




24.09.2012 22:00

developer   ~0004850

Last edited: 24.09.2012 22:02

May require an updated glext.h

Here is a newer version that was compiled against Doom 3 and includes
the GL_ARB_map_buffer_range extension.

Or just grab from



08.10.2012 20:02

reporter   ~0004897

I saw no FPS gain from doing this; in fact, I lost about 4-5 FPS on average. (YMMV; different hardware can influence the results.)

I compared against:

[Original.exe + Patch] vs. [Self Compiled] vs. [Self Compiled + MH VBO]

The MH VBO code was the slowest of all of them.

You will need updated versions of wglext.h and glext.h, plus you have to manually define several GL functions that are used by VertexCache.h and VertexCache.cpp. Basically, just look at how the game defines other GL functions for use, and do the same for the new functions.

I also had to replace a conditional:

glConfig.ARBMapBufferRangeAvailable // MH custom code? Not provided..

Replaced with:


I think it's the same thing, but I'm guessing, since I don't have MH's full source...
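
Editor's sketch: the replacement condition is not preserved in this note, so the following is not what was actually used. It only illustrates one common way a flag such as glConfig.ARBMapBufferRangeAvailable could be derived, assuming the driver's extension string (as returned by glGetString(GL_EXTENSIONS) on a compatibility context) is already available as a C string. `HasExtension` is a hypothetical helper name. It matches whole space-separated tokens, so a longer extension name that merely starts with the same text does not count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if `name` appears as a complete,
// space-separated token inside the extension list `extensions`.
bool HasExtension(const char* extensions, const char* name) {
    assert(name != nullptr && *name != '\0');
    if (extensions == nullptr) {
        return false;  // no extension string available
    }
    const std::size_t len = std::strlen(name);
    for (const char* p = std::strstr(extensions, name); p != nullptr;
         p = std::strstr(p + len, name)) {
        // A hit only counts if it is bounded by the string ends or spaces.
        const bool startOk = (p == extensions) || (p[-1] == ' ');
        const bool endOk   = (p[len] == '\0') || (p[len] == ' ');
        if (startOk && endOk) {
            return true;
        }
    }
    return false;
}
```

The result of HasExtension(extensionString, "GL_ARB_map_buffer_range") could then gate the glMapBufferRange path, with the stock double-buffered path as the fallback.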


08.10.2012 22:00

developer   ~0004899

Last edited: 11.03.2014 01:55

I would stop by:

and discuss your changes with Mh.

I am sure he would be quite happy to hear that you are testing

Did you test the changes with a Time Demo like the one that is shipped with TDM?

Reckless's general download page:

compare with Raynorpat's code below...



06.03.2014 22:27

developer   ~0006416

Last edited: 11.03.2014 01:55

(Raynorpat) Even further improvements (BFG's GLSL backend ported):



21.03.2014 02:25

developer   ~0006450

Related OpenGL optimizations:


12.11.2014 18:01

developer   ~0007120

Original work for comparison:


12.11.2014 19:22

developer   ~0007121

Revelator's recent post:


26.11.2016 16:45

developer   ~0008559

Has anyone actually managed to get rid of the VBO double buffer? I can see they have been trying to use glMapBufferRange, but with double-buffered VBOs it's the same speed or worse.


08.09.2017 15:20

developer   ~0009181

Added in rev 7116
Due to be replaced with BFG's implementation:


04.10.2017 15:22

developer   ~0009387

r_useMapBufferRange produces artifacts with com_smp and tdm_lg_interleave > 1.

Issue History

Date Modified Username Field Change
19.09.2012 00:47 nbohr1more New Issue
24.09.2012 22:00 nbohr1more Note Added: 0004850
24.09.2012 22:00 nbohr1more Note Edited: 0004850
24.09.2012 22:02 nbohr1more Note Edited: 0004850
08.10.2012 20:02 CodeMonkey Note Added: 0004897
08.10.2012 22:00 nbohr1more Note Added: 0004899
06.03.2014 22:27 nbohr1more Note Added: 0006416
07.03.2014 00:18 STiFU Relationship added child of 0003684
11.03.2014 01:55 nbohr1more Note Edited: 0004899
11.03.2014 01:55 nbohr1more Note Edited: 0006416
21.03.2014 02:25 nbohr1more Note Added: 0006450
12.11.2014 18:01 nbohr1more Note Added: 0007120
12.11.2014 19:22 nbohr1more Note Added: 0007121
26.11.2016 16:45 duzenko Note Added: 0008559
08.09.2017 15:19 nbohr1more Assigned To => duzenko
08.09.2017 15:19 nbohr1more Severity normal => feature
08.09.2017 15:19 nbohr1more Status new => resolved
08.09.2017 15:19 nbohr1more Resolution open => fixed
08.09.2017 15:19 nbohr1more Fixed in Version => TDM 2.06
08.09.2017 15:19 nbohr1more Target Version => TDM 2.06
08.09.2017 15:20 nbohr1more Note Added: 0009181
04.10.2017 15:22 nbohr1more Note Added: 0009387
25.06.2018 05:52 stgatilov Relationship added related to 0004849