View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0006725 | The Dark Mod | Feature proposal | public | 23.06.2026 20:18 | 28.06.2026 20:23 |
| Reporter | Geep | Assigned To | |||
| Priority | normal | Severity | normal | Reproducibility | have not tried |
| Status | new | Resolution | open | ||
| Product Version | TDM 2.14 | ||||
| Summary | 0006725: Read translation strings directly as UTF8 | ||||
| Description | It would simplify the translation process considerably if the engine could read the UTF-8 all.lang file directly, so that the other .lang files could be dispensed with. While a big ask code-wise, this could also provide a modern way forward. Maybe a 2.15 or 2.16 roadmap entry? | ||||
| Additional Information | Possible implementations: 1) Add C++ code to translate from UTF-8 to ISO-8859-x and other encodings. My gen_lang_plus utility has that for the ISO encodings. Related bugtracker 3012 also gives a possible code source. 2) More ambitiously, do away with 8-bit encoding entirely (except maybe keyboard entry) and use UTF-8 internally for strings. Instead of using DAT files for fonts, use a different format (let's call it UDAT here) that uses 16-bit Unicode values as the key, instead of an implied 8-bit index. Latin and Cyrillic font bitmaps can be merged, since there is no longer a 256-character limit. For FMs, if a given all.lang has been lost, it can easily be reassembled from its constituent .lang files with a utility. | ||||
| Tags | No tags attached. | ||||
| related to | 0003012 | new | Make FM readmes, titles and "More Info" translateable |
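
As a rough illustration of option 1, here is a minimal sketch of UTF-8 to ISO-8859-1 conversion. It is not taken from gen_lang_plus or from the TDM source; the function names `DecodeUtf8` and `Utf8ToLatin1` are made up for this example. It relies only on the fact that ISO-8859-1 byte values coincide with Unicode code points U+0000 to U+00FF. Other ISO-8859-x targets would need a per-encoding lookup table in place of the simple range check.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode one UTF-8 sequence starting at s[i] and advance i.
// Returns U+FFFD on malformed input. Simplification: overlong encodings and
// surrogate code points are not rejected.
static std::uint32_t DecodeUtf8(const std::string& s, std::size_t& i) {
    const auto byteAt = [&s](std::size_t k) {
        return static_cast<unsigned char>(s[k]);
    };
    const unsigned char lead = byteAt(i++);
    if (lead < 0x80) return lead;                       // plain ASCII

    int extra = 0;                                      // continuation bytes expected
    std::uint32_t cp = 0;
    if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
    else return 0xFFFD;                                 // stray continuation or invalid lead

    for (int n = 0; n < extra; ++n) {
        // Stop without consuming the offending byte so decoding can resync.
        if (i >= s.size() || (byteAt(i) & 0xC0) != 0x80) return 0xFFFD;
        cp = (cp << 6) | (byteAt(i++) & 0x3Fu);
    }
    return cp;
}

// Hypothetical converter: UTF-8 in, ISO-8859-1 out. Code points above U+00FF
// have no Latin-1 equivalent and are replaced by `fallback`.
std::string Utf8ToLatin1(const std::string& utf8, char fallback = '?') {
    std::string out;
    out.reserve(utf8.size());
    for (std::size_t i = 0; i < utf8.size();) {
        const std::uint32_t cp = DecodeUtf8(utf8, i);
        out.push_back(cp <= 0xFF
                          ? static_cast<char>(static_cast<unsigned char>(cp))
                          : fallback);
    }
    return out;
}
```

For instance, `Utf8ToLatin1("caf\xC3\xA9")` yields the four bytes `c`, `a`, `f`, `0xE9`, while a Cyrillic letter falls through to the fallback character because Latin-1 cannot represent it.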
|
Another possible implementation: 3) Like 2, but break up all.lang (and do away with it) into individual <language>.ulang files, encoded in UTF-8. |
|
|
Hmmm, the question of breaking up all.lang is somewhat orthogonal to the other issues, so let me add another possible implementation: 4) Like 1, but break up all.lang (and do away with it) into individual <language>.ulang files, encoded in UTF-8. all.lang is an enormous text file, which makes it unwieldy to work with. The reason to keep it whole was that all translations would refer to a common [English] source. If we do break it up, perhaps the convention would be to keep the source English as a big comment within each non-English .ulang file? Or maybe some versioning enumeration within english.ulang that the other .ulang files could reference? |
|
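
To make the versioning idea in 4) a bit more concrete, here is one way it might look. This is purely a sketch under assumptions: no .ulang format exists yet, so the `SourceEntry` and `TranslatedEntry` structs, the per-string revision counter, and the `FindStaleIds` function are all hypothetical, and the string IDs in the usage note are only illustrative. The assumption is that each English string carries a revision number that is bumped whenever its text changes, and each translated string records the English revision it was translated from.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory form of an english.ulang entry.
struct SourceEntry {
    std::string text;
    int revision;        // bumped whenever the English text changes
};

// Hypothetical in-memory form of an entry in any other <language>.ulang.
struct TranslatedEntry {
    std::string text;
    int sourceRevision;  // English revision this translation was made from
};

// Returns the IDs (in sorted order) whose translation is missing or was made
// from an older English revision, i.e. the strings a translator should revisit.
std::vector<std::string> FindStaleIds(
    const std::map<std::string, SourceEntry>& english,
    const std::map<std::string, TranslatedEntry>& translated)
{
    std::vector<std::string> stale;
    for (const auto& kv : english) {
        const auto it = translated.find(kv.first);
        if (it == translated.end() ||
            it->second.sourceRevision < kv.second.revision) {
            stale.push_back(kv.first);
        }
    }
    return stale;
}
```

With something like this, a translator tool could report that, say, `#str_00001` needs another look because english.ulang is at revision 2 while the German file still references revision 1, without needing the full English text duplicated as a comment in every file.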