View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0004504 | The Dark Mod | Sound System | public | 05.04.2017 12:37 | 05.06.2018 16:57 |
Reporter | stgatilov | Assigned To | stgatilov | ||
Priority | low | Severity | normal | Reproducibility | N/A |
Status | resolved | Resolution | fixed | ||
Product Version | SVN | ||||
Target Version | SVN | Fixed in Version | TDM 2.06 | ||
Summary | 0004504: Performance of reading OGG from PK4 | ||||
Description | When reading ogg file from pk4 archive, the engine performs much more file reading and zlib decompressing than necessary. For instance, Windows process explorer reports that TDM process reads more than 2.5 GB of data when loading "New Job", while only about 900 MB are really read through IdFile_InZip::Read. | ||||
Additional Information | The engine operates with any file inside a pk4 using idFile_InZip interface. This class has inefficient implementation of ::Seek method. It reads and decompresses the file until the given number of bytes is decompressed (SEEK_CUR with positive offset), or even reopens the file and starts decoding from scratch (SEEK_SET and SEEK_END). Note that all the data read and decompressed during Seek call is immediately discarded. The engine uses ogg vorbis library to load ogg files: idFile_InZip interface is passed to it in order to read directly from inside of pk4 archive. The library works in 'seekable' mode, i.e. it thinks that the input file can be easily seeked through back and forth. So the vorbis library performs a lot of seeks in order to read an ogg file. | ||||
Tags | No tags attached. | ||||
In order to get data, I added logging of all ::Read and ::Seek calls of IdFile_InZip (temporarily). For each call, I saved number of compressed bytes read&decompressed during the call. Here are cumulative numbers for all Read and all Seek calls: {'read': 871032184, 'seek': 1705039195} If we look at the numbers per file extension, only several types of files are actually seeked: 6888556 | .lwo:read 138472732 | .lwo:seek 473887194 | .ogg:read 1558358316 | .ogg:seek 2536785 | .wav:read 859881 | .wav:seek So ogg files cause the most of unnecessary reading&decompressing. Ironically, ogg files are already compressed, so there is no need to compress them inside pk4 files again. Here is the sequence of read+seek calls for a single ogg file (number tells how many bytes were read&decompressed during the call): tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 0 seek tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 65536 read tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 3566560 seek tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 3566560 seek tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 0 read tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 3566560 seek tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 0 read tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 3566560 seek tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 0 read tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 3566560 seek tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 0 read tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 0 read tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 65536 seek tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 0 read tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 3566560 seek tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 3566560 seek tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 0 read tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 3566560 seek tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 0 read tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 65536 read tdm_gui01.pk4/sound/meta/menu/amb_nocturne_menu.ogg 3566560 read |
|
One of the possible improvements for this issue is to store ogg files in uncompressed pk4 archives. However, simply repacking the pk4 files is not enough, because currently both idFile_InZip and underlying unzip implementation (which is some beta version of modern minizip library) do not support seeking directly through uncompressed data. So I copied the adapted version of minizip from dhewm3 repository and replaced the old unzip code with it. Then I copied unzseek function from the latest minizip release (version 1.1), and integrated it into idFile_InZip::Seek method. Lastly, I repacked all tdm_sound_*.pk4 files into pk4 archives without compression. As the result, the total number of bytes read has reduced in process explorer (like from 2.7 GB to 1.5 GB), and the total loading time has reduced from 1:25 to 1:00 on the investigated save. The total size of pk4 files has increased from 592 MB to 610 MB. |
|
I have committed the repacking code. It checks any pk4 file being opened in idFileSystemLocal::LoadZipFile, and if it contains a critical file compressed, then the whole archive is repacked. During repacking, a temporary file is created. Only the files which have wrong compression level are actually uncompressed, all the other files are copied directly without recompression. At the end, the original files is deleted and the temporary files is renamed. |
|
Some exact data about resulting improvement in loading times can be found in this forum post: http://forums.thedarkmod.com/topic/18758-inefficient-ogg-loading-from-pk4-and-minizip-update/?p=404957 |
|
Should we do the same with .jpg's? And I think .dds's are compressed already too? |
|
Jpg and dds files are read on one pass: there is no excessive seeking, like there is for ogg and video files. If you decompress all jpg files, you'll only save on their decompression time. which is likely not much. Here I would prefer approach "do not fix it until it's broken". As for DDS files, they are not compressed (just look inside). The same applies to tga. Removing compression from them would significantly increase size of packages, and probably even slow down loading -- how knows =) |
|
Date Modified | Username | Field | Change |
---|---|---|---|
05.04.2017 12:37 | stgatilov | New Issue | |
05.04.2017 12:37 | stgatilov | Status | new => assigned |
05.04.2017 12:37 | stgatilov | Assigned To | => stgatilov |
05.04.2017 12:44 | stgatilov | Note Added: 0008789 | |
05.04.2017 12:46 | stgatilov | Note Edited: 0008789 | |
05.04.2017 12:53 | stgatilov | Note Added: 0008790 | |
15.04.2017 08:26 | stgatilov | Relationship added | related to 0004507 |
02.05.2017 17:18 | stgatilov | Note Added: 0008825 | |
07.05.2017 02:17 | stgatilov | Relationship added | related to 0004529 |
07.05.2017 02:28 | stgatilov | Note Added: 0008842 | |
07.05.2017 02:29 | stgatilov | Status | assigned => resolved |
07.05.2017 02:29 | stgatilov | Fixed in Version | => TDM 2.06 |
07.05.2017 02:29 | stgatilov | Resolution | open => fixed |
10.06.2017 08:21 | duzenko | Note Added: 0008891 | |
10.06.2017 12:08 | stgatilov | Note Added: 0008892 | |
05.06.2018 16:57 | stgatilov | Relationship added | related to 0004824 |