0003234: Merge Mh's Optimized VBO Code - The Dark Mod Bugtracker

ID	Project	Category	View Status	Date Submitted	Last Update

0003234	The Dark Mod	Coding	public	19.09.2012 00:47	25.06.2018 05:52

Reporter	nbohr1more	Assigned To	duzenko
Priority	normal	Severity	feature	Reproducibility	have not tried
Status	resolved	Resolution	fixed
Platform	ALL	OS	ALL	OS Version	ALL
Product Version	SVN
Target Version	TDM 2.06	Fixed in Version	TDM 2.06

Summary	0003234: Merge Mh's Optimized VBO Code
Description	Quake coder Mh has taken an initial look at modernizing Doom 3's Vertex Cache and VBO Code: "What the glMapBufferRange stuff does is allow you to take advantage of a VBO streaming pattern that D3D has enjoyed since at least version 7 - in D3D terms it's known as the discard/no-overwrite pattern. A VBO is a GPU resource, and normally, if you try to update a GPU resource that is currently in use for drawing with (entirely possible because of the asynchronous nature of CPU/GPU operation), everything must stall and wait for drawing to complete before the update can happen. The stock Doom 3 code actually double-buffers its streaming VBOs to try to avoid this (in a slightly obfuscated way) but glMapBufferRange is a more robust way. So, I mentioned discard/no-overwrite above. Here's what they do. The buffer is filled in a linear manner. You've got 2 MB (or whatever) of space, vertexes are added beginning at position 0, as new vertexes are added they get appended until the buffer fills, then magic happens. This standard update is no-overwrite; your code makes a promise to GL that it's not going to overwrite any region of the buffer that may be currently in use for drawing, and in return GL will let you update the buffer without blocking. In order to be able to keep this promise your code must maintain a counter indicating how much space in the buffer it has previously used, and add new verts to the buffer at this counter position. When the buffer becomes full you "discard". This doesn't throw away anything previously added, instead GL will keep the previous block of buffer memory around for as long as is needed to satisfy any pending draw calls, but will give you a new, fresh block for any further updates. That's the "magic" I mentioned above, and it's what lets you use a streaming VBO without any blocking. This pattern will also let you get rid of Doom 3's double buffering, thus saving you some GPU memory (I haven't yet done this in my code).
Because there's no more blocking it will run faster in cases where there is a lot of dynamic buffer usage, but because Doom 3 locks at 60fps it may not be as directly measurable as if the engine was unlocked. Hence the "it feels more responsive but I can't quite put my finger on it" result. There's another chunk of code in the standard Alloc call which deals with updates of non-streaming VBOs and which is implemented in quite an evil manner by the stock Doom 3 code. When updating such a VBO you can get a faster update if the glBufferData params are the same as were previously used for that VBO (the driver can just reuse the previous block of buffer memory instead of needing to fully reallocate). Doom 3 doesn't do that, so it doesn't get these faster updates, but by searching the free static headers list for a VBO that matches and using that instead of just taking the first one from it, it can. Obviously it sucks that you need to search the list in this way, and a better implementation would just store the VBO with the object that uses it, and reuse the same VBO each time. Since this mainly happens with model animations an even better implementation would use transform feedback to animate the model instead of animating it on the CPU and needing to re-upload verts each frame, but I haven't even looked at that yet. So all in all the stock VBO implementation is an unholy mess that needs serious work to get it functioning right, much the same way as Quake 1 lightmap updates were a mess. That code just represents the start of a process..."
Steps To Reproduce	1) Replace VertexCache.cpp and VertexCache.h with the attached Dropbox download https://dl.dropbox.com/u/17706561/VBO.zip 2) Compile 3) Test the results
Tags	No tags attached.
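
For readers following the Description, here is a minimal sketch of the counter bookkeeping behind the discard/no-overwrite pattern Mh describes. It is not Mh's code and not the stock Doom 3 vertex cache: StreamCursor, StreamRegion and MapMode are hypothetical names, and the sketch deliberately models only the offset logic so it can be compiled without a GL context. The comments note which glMapBufferRange access flags each case would plausibly correspond to.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names; this models only the "how much have I used" counter.
enum class MapMode { NoOverwrite, Discard };

struct StreamRegion {
    std::size_t offset;  // byte offset at which the new vertices would be written
    MapMode mode;        // how the range would be mapped
};

class StreamCursor {
public:
    explicit StreamCursor(std::size_t capacity) : capacity_(capacity), used_(0) {
        assert(capacity > 0);
    }

    // Reserve `bytes` for newly streamed vertices.
    // Returns false if the request can never fit in the buffer.
    bool Reserve(std::size_t bytes, StreamRegion &out) {
        if (bytes > capacity_) {
            return false;
        }
        if (bytes > capacity_ - used_) {
            // Buffer is full: "discard". With GL this would map from offset 0
            // using GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT, so the
            // driver can hand back fresh storage while pending draws finish
            // with the old block.
            out = {0, MapMode::Discard};
            used_ = bytes;
        } else {
            // Still room: "no-overwrite". With GL this would map at the
            // counter position using GL_MAP_WRITE_BIT |
            // GL_MAP_UNSYNCHRONIZED_BIT, because we promise not to touch
            // bytes below `used_`.
            out = {used_, MapMode::NoOverwrite};
            used_ += bytes;
        }
        return true;
    }

    std::size_t Used() const { return used_; }

private:
    std::size_t capacity_;
    std::size_t used_;
};
```

In a real renderer the returned offset and mode would feed a glMapBufferRange call on the bound streaming VBO, followed by a memcpy and glUnmapBuffer. The promise to GL is kept only because the counter never moves backwards except on a discard.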

nbohr1more 24.09.2012 22:00 developer ~0004850 Last edited: 24.09.2012 22:02	May require an updated glext.h. Here is a newer version that was compiled against Doom 3 and includes the GL_ARB_map_buffer_range extension: https://github.com/LogicalError/doom3.gpl/blob/c13851cdcf983355548311d60adb86037910a4c2/neo/renderer/glext.h Or just grab it from OpenGL.org: http://www.opengl.org/registry/api/glext.h

CodeMonkey 08.10.2012 20:02 reporter ~0004897	I saw no FPS gain from doing this; in fact, I lost about 4-5 FPS on average. (YMMV, different hardware can influence the results.) I compared: [Original.exe + Patch] vs. [Self Compiled] vs. [Self Compiled + MH VBO]. The MH VBO code was the slowest of all of them. You will need updated versions of wglext.h and glext.h, plus you have to manually define several GL functions that are used by VertexCache.h\cpp. Basically, just look at how the game defines other GL functions for use, and do the same for the new functions. I also had to replace a conditional: glConfig.ARBMapBufferRangeAvailable // MH custom code? Not provided. Replaced with: R_CheckExtension("GL_ARB_map_buffer_range"). I think it's the same thing, but I'm guessing, since I don't have MH's full source...
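
A rough sketch of what such an availability check amounts to, assuming it boils down to looking up the extension name in the driver's space-separated extension string. HasExtension is a hypothetical helper written for illustration; it is not the engine's R_CheckExtension and not Mh's glConfig.ARBMapBufferRangeAvailable, neither of which is shown in this report. It matches whole tokens only, so a longer name that merely starts with the requested one does not count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: true if `name` appears as a complete token in the
// space-separated `extensions` list (the format used by GL_EXTENSIONS).
bool HasExtension(const char *extensions, const char *name) {
    assert(name != nullptr && *name != '\0');
    if (extensions == nullptr) {
        return false;
    }
    const std::size_t len = std::strlen(name);
    const char *p = extensions;
    while ((p = std::strstr(p, name)) != nullptr) {
        // Accept the hit only if it is bounded by the list edges or spaces.
        const bool startOk = (p == extensions) || (p[-1] == ' ');
        const bool endOk = (p[len] == '\0') || (p[len] == ' ');
        if (startOk && endOk) {
            return true;
        }
        p += len;  // skip past this partial match and keep searching
    }
    return false;
}
```

If the two conditionals differ in practice, one likely difference is cost: a boolean flag is read for free, while a string search repeated inside the allocation path is not. Running the search once at renderer startup and caching the result in a flag would sidestep that, assuming the flag in Mh's source is populated in a similar way.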

nbohr1more 08.10.2012 22:00 developer ~0004899 Last edited: 11.03.2014 01:55	I would stop by: http://forums.inside3d.com/viewtopic.php?f=9&t=3491&sid=af5a141b2f9395639118f7f508b2400a&start=540 and discuss your changes with Mh. I am sure he would be quite happy to hear that you are testing this. Did you test the changes with a Time Demo like the one that is shipped with TDM? Reckless's general download page: http://code.google.com/p/realm/downloads/list compare with Raynorpat's code below...

nbohr1more 06.03.2014 22:27 developer ~0006416 Last edited: 11.03.2014 01:55	(Raynorpat) Even further improvements (BFG's GLSL backend ported): http://forums.inside3d.com/viewtopic.php?f=9&t=3491&start=765 https://github.com/raynorpat/morpheus

nbohr1more 21.03.2014 02:25 developer ~0006450	Related OpenGL optimizations: http://blogs.nvidia.com/blog/2014/03/20/opengl-gdc2014/

nbohr1more 12.11.2014 18:01 developer ~0007120	Original work for comparison: http://pastebin.com/rHrwP0nA

nbohr1more 12.11.2014 19:22 developer ~0007121	Revelator's recent post: http://forums.thedarkmod.com/topic/15178-tdm-engine-development-page/page__st__600__p__358097#entry358097

duzenko 26.11.2016 16:45 developer ~0008559	Has anyone actually managed to get rid of the VBO double buffer? I can see they have been trying glMapBufferRange, but with double-buffered VBOs it's the same speed or worse.

nbohr1more 08.09.2017 15:20 developer ~0009181	Added in rev 7116. Due to be replaced with BFG's implementation: http://forums.thedarkmod.com/topic/18999-im-working-on-a-vr-version-early-alpha/ https://github.com/fholger/thedarkmod

nbohr1more 04.10.2017 15:22 developer ~0009387	r_useMapBufferRange produces artifacts with com_smp and tdm_lg_interleave > 1.

Date Modified	Username	Field	Change
19.09.2012 00:47	nbohr1more	New Issue
24.09.2012 22:00	nbohr1more	Note Added: 0004850
24.09.2012 22:00	nbohr1more	Note Edited: 0004850
24.09.2012 22:02	nbohr1more	Note Edited: 0004850
08.10.2012 20:02	CodeMonkey	Note Added: 0004897
08.10.2012 22:00	nbohr1more	Note Added: 0004899
06.03.2014 22:27	nbohr1more	Note Added: 0006416
07.03.2014 00:18	STiFU	Relationship added	child of 0003684
11.03.2014 01:55	nbohr1more	Note Edited: 0004899
11.03.2014 01:55	nbohr1more	Note Edited: 0006416
21.03.2014 02:25	nbohr1more	Note Added: 0006450
12.11.2014 18:01	nbohr1more	Note Added: 0007120
12.11.2014 19:22	nbohr1more	Note Added: 0007121
26.11.2016 16:45	duzenko	Note Added: 0008559
08.09.2017 15:19	nbohr1more	Assigned To	=> duzenko
08.09.2017 15:19	nbohr1more	Severity	normal => feature
08.09.2017 15:19	nbohr1more	Status	new => resolved
08.09.2017 15:19	nbohr1more	Resolution	open => fixed
08.09.2017 15:19	nbohr1more	Fixed in Version	=> TDM 2.06
08.09.2017 15:19	nbohr1more	Target Version	=> TDM 2.06
08.09.2017 15:20	nbohr1more	Note Added: 0009181
04.10.2017 15:22	nbohr1more	Note Added: 0009387
25.06.2018 05:52	stgatilov	Relationship added	related to 0004849

Relationships

related to	0004849	resolved	cabalistic	Replace VBO implementation with the one similar to D3BFG
child of	0003684	new		Investigate GPL Renderer Improvements