View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0004848 | The Dark Mod | Coding | public | 23.06.2018 16:19 | 30.01.2024 22:22 |
Reporter | duzenko | Assigned To | duzenko | ||
Priority | high | Severity | crash | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Product Version | SVN | ||||
Target Version | TDM 2.07 | Fixed in Version | TDM 2.07 | ||
Summary | 0004848: Crashes when quick loading | Lightgem as Subview | ||||
Description | Not sure if it's related to the VBO changes, but still assigning to @cabalistic just in case. | ||||
Steps To Reproduce | I usually crash the second or third time I quick load on maps like Inn Business or Rightful Property. | ||||
Additional Information | read access violation. backEnd.vLight->lightDef was 0x49BC71E8. > TheDarkModNoTools.exe!RB_CreateSingleDrawInteractions(const drawSurf_s * surf) Line 801 C++ TheDarkModNoTools.exe!RB_GLSL_CreateDrawInteractions(const drawSurf_s * surf) Line 179 C++ [Inline Frame] TheDarkModNoTools.exe!RB_GLSL_DrawLight_Stencil() Line 261 C++ TheDarkModNoTools.exe!RB_GLSL_DrawInteractions() Line 346 C++ TheDarkModNoTools.exe!RB_STD_DrawView() Line 1284 C++ TheDarkModNoTools.exe!RB_DrawView() Line 872 C++ TheDarkModNoTools.exe!RB_ExecuteBackEndCommands(const emptyCommand_t * cmds) Line 593 C++ [Inline Frame] TheDarkModNoTools.exe!R_IssueRenderCommands(frameData_t * frameData) Line 135 C++ TheDarkModNoTools.exe!idRenderSystemLocal::EndFrame(int * frontEndMsec, int * backEndMsec) Line 645 C++ TheDarkModNoTools.exe!idSessionLocal::UpdateScreen(bool outOfSequence) Line 2648 C++ [Inline Frame] TheDarkModNoTools.exe!idSessionLocal::ShowLoadingGui() Line 480 C++ TheDarkModNoTools.exe!idSessionLocal::ExecuteMapChange(bool noFadeWipe) Line 1520 C++ TheDarkModNoTools.exe!idSessionLocal::LoadGame(const char * saveName) Line 2173 C++ TheDarkModNoTools.exe!LoadGame_f(const idCmdArgs & args) Line 1780 C++ TheDarkModNoTools.exe!idCmdSystemLocal::ExecuteTokenizedString(const idCmdArgs & args) Line 482 C++ TheDarkModNoTools.exe!idCmdSystemLocal::ExecuteCommandBuffer() Line 662 C++ TheDarkModNoTools.exe!idEventLoop::RunEventLoop(bool commandExecution) Line 173 C++ TheDarkModNoTools.exe!idCommonLocal::Frame() Line 2426 C++ TheDarkModNoTools.exe!WinMain(HINSTANCE__ * hInstance, HINSTANCE__ * hPrevInstance, char * lpCmdLine, int nCmdShow) Line 1371 C++ | ||||
Tags | No tags attached. | ||||
Attached Files | |||||
related to | 0006278 | new | Func_Peek lightgem glitch |
Afaik it was reported by a user for 2.06 already, wasn't it? If so, it's not related to VBO. Unfortunately, I cannot reliably reproduce this. I've seen it happen once or twice, but whenever I try to capture it in the debugger, I can reload again and again and nothing's happening :( |
|
What's puzzling for me, though, is why would you have any active lights if showing the map loading screen? Sounds to me like it's stuck somewhere in a transition? | |
Save game attached. It has active lights when map is about to show the loading GUI. But at that point all light defs have already been destroyed. |
|
#define USE_LIBC_MALLOC 0 appears to work around it. Looks like it is an old bug, but it did not manifest until recently because idHeap masked it. |
|
At revision: 7514 | |
Now the game crashes during quickload again (Inn Business, Release x64) The call stack is: > TheDarkModx64.exe!RB_ExecuteBackEndCommands(const emptyCommand_t * cmds) Line 573 C++ TheDarkModx64.exe!R_IssueRenderCommands(frameData_t * frameData) Line 138 C++ TheDarkModx64.exe!LightGem::Render() Line 348 C++ TheDarkModx64.exe!idRenderSystemLocal::EndFrame(int * frontEndMsec, int * backEndMsec) Line 643 C++ TheDarkModx64.exe!idSessionLocal::UpdateScreen(bool outOfSequence) Line 2650 C++ [Inline Frame] TheDarkModx64.exe!idSessionLocal::ShowLoadingGui() Line 480 C++ TheDarkModx64.exe!idSessionLocal::ExecuteMapChange(bool noFadeWipe) Line 1522 C++ TheDarkModx64.exe!idSessionLocal::LoadGame(const char * saveName) Line 2175 C++ TheDarkModx64.exe!LoadGame_f(const idCmdArgs & args) Line 1781 C++ In the switch line: while (cmds) { switch ( cmds->commandId ) { case RC_NOP: It says cmds is 0x28000000000, which is access violation. |
|
I have tried to understand what is going on here. So there are frameData_t of various kind. Two of such frameData-s are located in backend (tr_main.cpp): frameData_t smpFrameData[2]; With some pointers referencing these two frames. Each of these two frameData-s has a linked list of commands: emptyCommand_t *cmdHead, *cmdTail; They allocated from the corresponding frame memory. Also, there are two lightgem datas in LightGem: lightGemDrawCmds_t m_LightgemDrawCmds[2]; And each of them has its own commands linked list. When time comes, commands list of lightgem is temporarily patched into the backend's command list, only to add a few commands to the list (I guess to reuse global backend functions). However, lightgem commands are still allocated from the same two chunks of frame memory as ordinary backend commands are. This creates a dangerous situation when someone can add one more frame switch in one place without clearing the other commands list, and new commands will start randomly overwriting the commands still referenced in the other list. I don't know how tied together these things are, but I feel this is not very good. Ideally, lightgem commands should be allocated from their own buffers. |
|
Ah, hm. That's probably something I messed up, I'll look into it. However, smpFrameData is the general-purpose allocation buffer for a single frame. It's not just for commands, but it's basically to allocate anything that lives and is valid for a single frame. The separate lightgem commands are a bit of a hack, yes. That's mostly because you can't directly add them to the standard command queue due to subtle assumptions that would be violated. And unfortunately, due to the stateful backend, you have to replace the standard queue with the lightgem one to render it, or refactor a good part of the backend to not be so stateful. But the lightgem commands are still tied to the same frame as the standard command queue. So it is correct that they draw from the same frameData allocation pool. If that leads to crashes due to incorrect cleanup of the lightgem, then that needs to be fixed instead. Introducing a separate frameData buffer might fix the crash, but it would still mean that data meant for a different frame is used in the wrong place, so it's not the right fix, imho. |
|
The problem is caused by svn rev 7514, which was supposed to fixquickload crashes. Before that quickloading crashed for a different reason, I guess. |
|
Do we want to convert lightgem to a regular subview so that it's drawn just like everything else in TDM? It would remove all the hacks accumulated over the years. ... Let me try that locally and I will post my version to github for discussion. |
|
Proposed changes (this fixes the quick load crash to me) https://github.com/duzenko/darkmod-experiments/commit/4726d715b5fece44f5467001625e0f42f9da3bb4 |
|
I thought it always was a subview? Or did the IsLightGem() conversion change that? Code looks good to me but do optimizations like this still work? renderworld_portals.cpp if ( !entity->parms.isLightgem && entity->parms.noShadow && tr.viewDef->IsLightGem() ) continue; |
|
What is subview exactly? What is already working via subviews in TDM? |
|
A subview is a render like mirror, portal sky, or remote camera. I.e. a render separate to "main" player's eyes view. Subviews are rendered inside the same backend run as the main view. It's sometimes copied to texture for use in the main view. Before 2.06 we only had xray, remote camera, and mirror as subviews. Portal sky and lightgem were done as separate backend calls due to doom3 source code closed at the time of their implementation. In 2.06 I moved portal sky to the subview system. Now it's lightgem turn. By moving to subviews I mean rendering it in the same backend run so that we don't need to maintain the separate backend data for lightgem. I think the lightgem optimization should still work because I implicitly mark the subview's viewId as lightgem. It looks good when I view it with r_subviewsONly 1 and the lightgem HUD seems to change adequately. Things TODO is revisiting/reimplementing the "hand held" objects influence, split/interleave, and source code level object links (I moved some things from private to public and wonder if it could be done better) |
|
I have no idea how far this change goes. I tested two maps with this patch, doing many quicksaves/quickloads. No crashes at all. Also, did not notice any problems with lightgem. Probably wait for Cabalistic's opinion? P.S. When I first came to TDM, lightgem was working via taking screenshot to named pipe: 0002768 =) |
|
In principle, it sounds like a way better idea than what I hacked together to get the lightgem onto the frontend/backend split. I wasn't knowledgable enough about the subviews system to attempt it myself. I'll have a look at the code later :) |
|
At revision: 7532 | |
I left you a few comments on the Github commit. Good work, overall. I only have one concern about certain functionality you commented out. And also, please let's remove the old implementation instead of commenting it out, especially those parts you moved into a new function. That's what source control is for :) | |
@grayman asked me to be careful with deletions. I should have left a note in the comment saying to delete after 2.07 release. | |
But in this case you basically just moved functionality and removed a hack I only introduced with the BFG changes. So those changes were never even "productive". No point in letting them in the code for 2.07. I think the issue with quickloads related to the lightgem rendering is fixed now. But this is a change that was introduced after 2.06. Are we confident the original problem is understood and fixed, as well? |
|
I suppose Grayman was mainly talking about deleting the code which you did NOT write, especially some code which is present in TDM for a long time and does not make any harm to anyone. If you MOVE code, then you should remove it at the old location, otherwise you make things more confusing =) Anyway, I have a question: why exactly do we need two RAM buffers for lightgem contents? And why just creating two buffers is enough for fixing it? I think this issue should be marked as resolved now. If someone else has quickloading crashes in the nearest future, he can reopen it. As for why 2.06 was sometimes crashing, I'm afraid it is useless to seek the answer on the current SVN version. If you really like to know the historic answer, you can checkout 2.06 branch and investigate. |
|
Rendering the lightgem now happens on the backend, while analysis of the result happens in the frontend. Since both can run in parallel, they should not use the same buffer. So one buffer belongs to the frontend to analyse while backend copies results of the current frame lightgem render to the other buffer. | |
And why this double buffering is not necessary for other subviews then? | |
Because they don't produce feedback to the frontend. Lightgem is unique in that we read back the results of the render to the system RAM and decide based on the contents how dark or light the lightgem should be. The other subviews just produce partial results that are then used in the remaining rendering pipeline on the backend. | |
The double-buffer change seems to have cured the flickering AI issue... However, now No Honor Among Thieve's crashes to the console after clicking the mission start prompt. VertexCache static vertex buffer uploaded, memory used: 40563 kb VertexCache static index buffer uploaded, memory used: 9256 kb Generated framebuffer DEPTH_STENCIL attachment: 1366x768 Generated framebuffer DEPTH_STENCIL attachment: 1366x768 speaker_zone_ambient::init() called Color read from main ambient light 'ambient_world': 0.05 0.05 0.08 Changed location from '' to 'town'. --------- Game Map Shutdown ---------- WARNING:Door VaultDoor2 is not within a valid AAS area WARNING:Door treasurelid2 is not within a valid AAS area WARNING:Door safe_door_1 is not within a valid AAS area WARNING:Door atdm_mover_door_3 is not within a valid AAS area ModelGenerator memory: 21 LOD entries with 31 users using 10206 bytes, memory s aved: 13341 bytes. --------- Game Map Shutdown done ----- ERROR:idRenderWorld::UpdateEntityDef: index = -1 -------------------------------------- ]condump nhat_crash.txt Dumped console text to nhat_crash.txt. |
|
Same issue in "The Transaction" which also starts with a scripted cut-scene. | |
The double buffers we were talking about are not related to anything visual (except for the lightgem value), so I don't see how they would have anything to do with AI flickering. I did increase the initial amount of vertex cache, though, so if the flickering was due to having to resize the cache, that might have been it. You could experiment with the relevant CVars, r_frameVertexMemory and r_frameIndexMemory. For me, the resize is effectively flicker-free, but I'm running fairly high frame rates so that might be hiding the effect. |
|
As for the cutscene errors: I found the cause. The lightgem render is trying to hide the player from view for the lightgem render, but the player apparently does not exist during those cutscenes. We probably missed a conditional when moving the lightgem to a subview to skip the lightgem in these circumstances. I will double-check the changes. | |
Should be fixed with r7553. | |
Thank you. No crash in either NHAT or Transaction, including the 2nd cut-scene in Transaction. With a good 20 minutes of testing, only one instance of a flickering AI happened and he flickered once and nearly imperceptibly. This is golden. |
|
Do we want to interleave lightgem rendering? I added tdm_lg_interleave as check for LG toggle for now. |
|
Personally, I don't think it's worth it anymore. From my nSight profilings, the lightgem rendering inside the normal backend operations is a very minor part of the frame render, and doing it only every n-th frame isn't going to save any significant processing or rendering time. As such I think it only makes the code more complicated and adds another potential for subtle lightgem bugs? | |
Hmm. The users who benefit from lg_interleave are single core users. They only benefit if lg_split is active as well. So we have these questions: 1) How many single core users do we have? 2) Of these users, how many have very poor performance in 2.06? 3) Of these users, how many have tried LG interleave options? That said, I am guessing it wouldn't be too hard to add lg_split back and enable it only when com_smp = 0? |
|
Ever since I divided the lightgem rendering into frontend/backend part (and also now with the subview), it only renders one lightgem image per frame, and it alternates up or down between frames. So the old lg_split option is gone, but technically always active. | |
Thanks, that narrows down things a bit. If we keep lg_interleave, we should still constrain it to com_smp 0. (If we are going to keep offering a single core mode...) |
|
I have one question: is this thing done? Did anyone experience savegame crashes lately? The work about moving lightgem into subview is also long done, I guess. If the core things are done, it's better to resolve the issue (you can move any follow-up tasks into separate issues). |
|
Date Modified | Username | Field | Change |
---|---|---|---|
23.06.2018 16:19 | duzenko | New Issue | |
23.06.2018 16:19 | duzenko | Status | new => assigned |
23.06.2018 16:19 | duzenko | Assigned To | => cabalistic |
23.06.2018 16:33 | cabalistic | Note Added: 0010567 | |
23.06.2018 16:35 | cabalistic | Note Added: 0010568 | |
24.06.2018 11:34 | duzenko | Assigned To | cabalistic => duzenko |
24.06.2018 12:12 | duzenko | File Added: savegames.zip | |
24.06.2018 12:15 | duzenko | Note Added: 0010577 | |
24.06.2018 16:11 | duzenko | Note Added: 0010579 | |
24.06.2018 17:48 | duzenko | Note Added: 0010580 | |
28.06.2018 17:35 | stgatilov | Note Added: 0010609 | |
28.06.2018 17:44 | stgatilov | Note Edited: 0010609 | |
29.06.2018 16:57 | stgatilov | Note Added: 0010610 | |
29.06.2018 16:58 | stgatilov | Note Edited: 0010610 | |
29.06.2018 16:58 | stgatilov | Note Edited: 0010610 | |
29.06.2018 16:58 | stgatilov | Note Edited: 0010610 | |
29.06.2018 23:38 | cabalistic | Note Added: 0010611 | |
30.06.2018 03:30 | stgatilov | Note Added: 0010612 | |
30.06.2018 03:30 | stgatilov | Note Edited: 0010612 | |
30.06.2018 05:23 | duzenko | Note Added: 0010616 | |
30.06.2018 05:25 | duzenko | Note Edited: 0010616 | |
30.06.2018 07:27 | duzenko | Note Edited: 0010616 | |
30.06.2018 09:38 | duzenko | Note Added: 0010621 | |
30.06.2018 15:19 | nbohr1more | Note Added: 0010622 | |
30.06.2018 16:06 | stgatilov | Note Added: 0010623 | |
30.06.2018 16:19 | duzenko | Note Added: 0010624 | |
30.06.2018 17:47 | stgatilov | Note Added: 0010625 | |
30.06.2018 17:48 | stgatilov | Note Edited: 0010625 | |
30.06.2018 17:49 | stgatilov | Note Edited: 0010625 | |
30.06.2018 17:51 | cabalistic | Note Added: 0010626 | |
01.07.2018 07:57 | duzenko | Note Added: 0010630 | |
01.07.2018 08:26 | cabalistic | Note Added: 0010631 | |
01.07.2018 08:31 | duzenko | Note Added: 0010632 | |
01.07.2018 09:03 | cabalistic | Note Added: 0010633 | |
07.07.2018 07:26 | stgatilov | Note Added: 0010649 | |
07.07.2018 07:28 | stgatilov | Note Edited: 0010649 | |
07.07.2018 07:31 | cabalistic | Note Added: 0010650 | |
07.07.2018 07:43 | stgatilov | Note Added: 0010651 | |
07.07.2018 07:46 | cabalistic | Note Added: 0010652 | |
07.07.2018 12:21 | nbohr1more | Note Added: 0010654 | |
07.07.2018 12:24 | nbohr1more | Note Added: 0010655 | |
07.07.2018 12:32 | cabalistic | Note Added: 0010656 | |
07.07.2018 12:58 | cabalistic | Note Added: 0010657 | |
07.07.2018 13:14 | cabalistic | Note Added: 0010658 | |
07.07.2018 16:51 | nbohr1more | Note Added: 0010659 | |
08.07.2018 09:29 | duzenko | Summary | Crashes when quick loading => Crashes when quick loading | Lightgem as Subview |
08.07.2018 09:30 | duzenko | Note Added: 0010660 | |
08.07.2018 09:46 | cabalistic | Note Added: 0010661 | |
08.07.2018 13:37 | nbohr1more | Note Added: 0010662 | |
08.07.2018 13:40 | cabalistic | Note Added: 0010663 | |
08.07.2018 13:54 | nbohr1more | Note Added: 0010664 | |
19.07.2018 08:55 | duzenko | Severity | crash => tweak |
29.07.2018 15:10 | stgatilov | Note Added: 0010732 | |
30.07.2018 13:55 | nbohr1more | Priority | normal => high |
30.07.2018 13:55 | nbohr1more | Severity | tweak => crash |
30.07.2018 13:55 | nbohr1more | Status | assigned => resolved |
30.07.2018 13:55 | nbohr1more | Resolution | open => fixed |
30.07.2018 13:55 | nbohr1more | Fixed in Version | => TDM 2.07 |
30.01.2024 22:22 | stgatilov | Relationship added | related to 0006278 |