gnartsch on 23/9/2012 at 20:16
When working on a translation for a TDS mission, I realized how terrible it is to read the TDS texts in general.
I came up with the idea for a tool which simply extracts all the text for a given language,
removes all the 'irrelevant' TDS formatting stuff and generates a big tidy, plain text file,
which you can easily read, hand out to someone else for review, or even feed to your favorite spell-checker.
Since the tool creates a specific file for the specified language, you can easily scan it and find texts which are not properly translated.
The other thing it can do: it checks for common formatting issues, such as
- text not properly enclosed with double quotes
- no language tag found at all for a specific text
- duplicate language tags for the same language
- 'newline' within the text
- bad date/time format
- broken formatting tags
- overlong words
- missing name tag for a text item
- missing 'TextEntry' tag
- ...
All these issues can and do cause text to show up broken or not at all.
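A minimal sketch of how a few of these checks could look, assuming (for illustration only) that each translation sits on its own line as `lang_<name> "text"`. That layout, the regular expression and the function name `check_entry` are assumptions, not the real .sch grammar or the tool's actual code. Tags are compared exactly as written.

```python
import re
from collections import Counter

# Assumed layout (illustration only): one 'lang_<name> "text"' pair per line.
LANG_LINE = re.compile(r'^\s*lang_(\w+)\s+(.*)$')


def check_entry(lines, language):
    """Return (line_number, problem) tuples for the lines of one text entry.

    A line_number of 0 means the problem concerns the entry as a whole.
    """
    problems = []
    seen = Counter()
    for number, line in enumerate(lines, start=1):
        match = LANG_LINE.match(line)
        if not match:
            continue  # not a language line under the assumed layout
        name, text = match.group(1), match.group(2).strip()
        seen[name] += 1
        if seen[name] == 2:
            problems.append((number, f"duplicate language tag: {name}"))
        # The text must start and end with a double quote.
        if not (len(text) >= 2 and text.startswith('"') and text.endswith('"')):
            problems.append((number, "text not enclosed in double quotes"))
    if seen[language] == 0:
        problems.append((0, f"no language tag found for: {language}"))
    return problems
```

Calling `check_entry(entry_lines, "german")` on the lines of a single entry returns an empty list when none of the three checks fires.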
Instructions & Download can be found here:
TDS Text Extractor & Checker: http://www.gnartsch.de/TDSTextExtractor/TDSTextExtractor.shtml
(Requires Java Runtime, JRE 5.0 or higher)
It might be able to do more in the future... depending on the feedback I receive.
I am sure there are some things I missed.
What do you think about it ?
Beleg Cúthalion on 27/9/2012 at 08:07
I tried to use it but then I noticed it works only with fully zipped FM archives. Anyway, the idea is probably good. Have you ever used the T3 Text Editor? In case it's not online anymore on the forums I could give you the files I think. This would allow editing texts and could be expanded at some point to work in both directions: Having a WYSIWYG system for creating text files with all their format but also loading them and seeing a plain text version of it.
Oh, sorry, I tend to get imaginative when it comes to software development. :p
gnartsch on 27/9/2012 at 22:31
Quote Posted by Beleg Cúthalion
I tried to use it but then I noticed it works only with fully zipped FM archives. Anyway, the idea is probably good. Have you ever used the T3 Text Editor? In case it's not online anymore on the forums I could give you the files I think. This would allow editing texts and could be expanded at some point to work in both directions: Having a WYSIWYG system for creating text files with all their format but also loading them and seeing a plain text version of it.
Oh, sorry, I tend to get imaginative when it comes to software development. :p
Hi Beleg!
Yes, I have tried T3TextEd. But honestly: I can't cope with it. The functionality it offers is too small for the pain you get in return.
When working on a translation of a really laaaaarge campaign, I ended up using a good text editor that let me quickly check for text references in other files, jump quickly from one file to another, and much much more ... none of which T3TextEd can help with in any way.
I guess that authors do it just the same way.
That text editor allowed me to make tremendous progress on translating hundreds of text files within a fairly short period of time.
I agree that a user-friendly WYSIWYG editor for TDS would be a good thing, if it also checked for errors in formatting and such, but no one has created one...
What do you mean by "TDS Extractor & Checker works only on zipped missions"?
It takes no zips. You have to unzip upfront (in case you have a ZIP already) or simply point it to the folder where you keep all your mission files.
And it will analyze whatever .sch file it can find in any folder below the specified directory.
However, it is recommended to use it on a folder structure that is 'ready for zipping',
because then it can do some sorting of the files upfront before creating the large text file.
Then it would identify the names of the briefings and analyze & extract the files in this order:
- briefing mission 1
- objectives mission 1
- debriefing mission 1
...
- briefing mission n
- objectives mission n
- debriefing mission n
- all the BOOKS (alphabetically ordered)
- all the files from STRINGS
- any other .sch files that are around just anywhere
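A rough sketch of the recursive search and the last three steps of that ordering, assuming the folders are literally named BOOKS and STRINGS (matched case-insensitively here). The function name `ordered_sch_files` is hypothetical, and the per-mission briefing/objectives/debriefing ordering is left out because the thread does not say how those files are named.

```python
from pathlib import Path


def ordered_sch_files(root):
    """Return every .sch file below root: BOOKS first, then STRINGS, then the rest.

    Within each group the files are sorted alphabetically by file name.
    """
    root = Path(root)

    def group(path):
        # Folder names between root and the file, upper-cased for comparison.
        folders = {part.upper() for part in path.relative_to(root).parts[:-1]}
        if "BOOKS" in folders:
            return 0
        if "STRINGS" in folders:
            return 1
        return 2  # any other .sch file, including ones directly in root

    files = [p for p in root.rglob("*")
             if p.is_file() and p.suffix.lower() == ".sch"]
    return sorted(files, key=lambda p: (group(p), p.name.lower()))
```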
snobel on 1/10/2012 at 10:01
Nice tool... A couple of suggestions for the checking part, just in case you get bored: :rolleyes:
- Report tags like "lang_polish". (Those don't really work, do they?)
- It would of course be very useful if it could report if any of the four lines were missing or empty for any TextEntry. (Should probably be optional)
- Would be nice to be able to run the checker part on its own, and have it report problems one per line (filename, line number, problem etc.) with output to the terminal
Btw, what's the deal with newlines in .sch files? Both "\n" and "<br>" are used, but it seems they don't both work anywhere?
A couple of small issues:
- I had no java, so I installed Java 7 and rebooted. But JRE_HOME is not set, so running the .bat file fails. If I run the java command directly it works
- You apparently get an error if there's a schema file in the root of the folder you specify
- Edit: I'm pretty sure that while the "lang" part of the tag is case sensitive, the language name is not? So "lang_English" etc. should be recognized
gnartsch on 1/10/2012 at 18:01
Quote Posted by snobel
Nice tool... A couple of suggestions for the checking part, just in case you get bored: :rolleyes:
- Report tags like "lang_polish". (Those don't really work, do they?)
Makes sense. I will add that as soon as I figure out what the supported languages are.
Quote:
- It would of course be very useful if it could report if any of the four lines were missing or empty for any TextEntry. (Should probably be optional)
Right now you should receive a WARNING if the lang_... tag is missing entirely for the specified language for any TextEntry, or if the text is simply blank (e.g. lang_english .... "")
Did you run across such a scenario where it did not give you a warning ?
Quote:
- Would be nice to be able to run the checker part on its own, and have it report problems one per line (filename, line number, problem etc.) with output to the terminal
Good catch ! Will be added soon.
Quote:
Btw, what's the deal with newlines in .sch files? Both "\n" and "<br>" are used, but it seems they don't both work anywhere?
The point here is not about "\n" or "<br>". It is about the user hitting the ENTER key in his editor. TDS treats any newline as the end of the text and any text that follows will be swallowed ... and the user might not notice his error.
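As a small illustration of that check: a text broken by the ENTER key leaves a line with an opening double quote but no closing one. The sketch below simply flags every line that holds an odd number of double quotes. It assumes texts are delimited by plain double quotes with no escaped quotes inside, and the function name `find_unclosed_texts` is made up for this example.

```python
def find_unclosed_texts(lines):
    """Return the 1-based numbers of lines holding an odd number of double quotes.

    Both the line where a text was cut off and the line carrying the stray
    closing quote are reported.
    """
    return [number for number, line in enumerate(lines, start=1)
            if line.count('"') % 2 == 1]
```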
Quote:
A couple of small issues:
- I had no java, so I installed Java 7 and rebooted. But JRE_HOME is not set, so running the .bat file fails. If I run the java command directly it works
- You apparently get an error if there's a schema file in the root of the folder you specify
- Edit: I'm pretty sure that while the "lang" part of the tag is case sensitive, the language name is not? So "lang_English" etc. should be recognized
Well... in the past the JRE used to register such a variable, but indeed this does not seem to happen anymore.
The easiest way might be to simply edit the tdschecker.bat file.
Line 3 reads:
Quote:
@rem set JRE_HOME="C:\Program Files\Java\jre6"
Figure out where your JRE got installed and change that line to something like :
Quote:
set JRE_HOME="C:\Program Files\Java\jre6"
Take note that you have to delete the leading @rem!
Case sensitive lang_-tag : Good catch! Yes, the entire thing should probably be case insensitive. Just need to test if this is true or not.
There are probably a couple more tags that have the same issue.
snobel on 1/10/2012 at 19:21
Quote Posted by gnartsch
Makes sense. I will add that as soon as I figure out what the supported languages are.
There's a list in Default.ini, they're only English, German, French and Italian. It looks like Spanish was planned, I tried adding that to the list once, but I didn't get it to work.
Quote:
Did you run across such a scenario where it did not give you a warning ?
The warnings work fine for the extracted language, but it would be useful to be able to check all 4 supported languages in one go, to identify FMs with unsupported locales. (I'd like to be able to run it from a script to check all the FMs :ebil:).
Thanks for adding the checking-only mode, maybe the above could be combined with that, since it doesn't make much sense for text extraction?
Quote:
Well... in the past the JRE used to register such a variable, but indeed this does not seem to happen anymore.
I'll probably be running it from a linux script anyway, so no worries...
Quote:
Case sensitive lang_-tag : Good catch! Yes, the entire thing should probably be case insensitive. Just need to test if this is true or not.
There are probably a couple more tags that have the same issue.
My FM loader code, which runs before the game proper, needs to figure out where to copy the options.ini file during FM installation, so I had a look at this in some detail, to make sure my code and the game would agree... It was a bit surprising that the "lang" part of the tag turned out to be case sensitive, when the language name part of the tag, and the name of the text entry ("T_UserOptionsTitle" in my case), weren't.
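The rule described here could be sketched like this: the "lang" prefix is compared exactly, the language name case-insensitively. The four names are the ones reported from Default.ini earlier in the thread; the function name `classify_tag` and its return strings are made up for illustration and are not taken from the game or from the FM loader.

```python
# Languages reported as listed in Default.ini; passed in so it can be changed.
SUPPORTED = frozenset({"english", "german", "french", "italian"})


def classify_tag(tag, supported=SUPPORTED):
    """Classify a tag such as 'lang_English' under the rule described above."""
    prefix, separator, name = tag.partition("_")
    # The prefix must be exactly 'lang' (case sensitive).
    if separator != "_" or prefix != "lang":
        return "not a language tag"
    # The language name is compared without regard to case.
    if name.lower() in supported:
        return "supported"
    return "unsupported language"
```

With this rule, "lang_English" counts as supported, "Lang_english" is not recognized as a language tag at all, and "lang_polish" is reported as an unsupported language.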
gnartsch on 1/10/2012 at 20:54
Quote:
There's a list in Default.ini, they're only English, German, French and Italian. It looks like Spanish was planned, I tried adding that to the list once, but I didn't get it to work.
Indeed it seems like Spanish got dropped before the game was released.
Quote:
The warnings work fine for the extracted language, but it would be useful to be able to check all 4 supported languages in one go, to identify FMs with unsupported locales. (I'd like to be able to run it from a script to check all the FMs :ebil:).
Thanks for adding the checking-only mode, maybe the above could be combined with that, since it doesn't make much sense for text extraction?
Checking all languages by default was on my list anyway. But there will always be the option to also export the text files, because it does make sense, I think.
Line numbers are already in, as are console output and language checking.
Oh, and the bug with .sch files at the root folder is fixed already. Thanks for pointing that one out. Exceptions always look sooo terrible...
I just need to do more testing on the case-sensitive/insensitive stuff and optimize the output a bit.
Quote:
My FM loader code, which runs before the game proper, needs to figure out where to copy the options.ini file during FM installation, so I had a look at this in some detail, to make sure my code and the game would agree... It was a bit surprising that the "lang" part of the tag turned out to be case sensitive, when the language name part of the tag, and the name of the text entry ("T_UserOptionsTitle" in my case), weren't.
That is probably just something they did not bother caring about.
Maybe EIDOS had an Editor which would have handled that....so they never noticed.
Beleg Cúthalion on 2/10/2012 at 14:03
Quote Posted by snobel
There's a list in Default.ini, they're only English, German, French and Italian. It looks like Spanish was planned, I tried adding that to the list once, but I didn't get it to work.
Does "work" refer to "the game is able to read new language tags in schema files if they were set up in the Default.ini before"? Because I once downloaded a fan-made Spanish readable pack which IMHO did have Spanish tags in both files (or file groups). And I'm pretty sure I already saw "Russian" as a tag, even though our Russian friends tend to occupy the French tag in T3 FMs.
snobel on 2/10/2012 at 15:41
Quote Posted by Beleg Cúthalion
Does "work" refer to "the game is able to read new language tags in schema files if they were set up in the Default.ini before"? Because I once downloaded a fan-made Spanish readable pack which IMHO did have Spanish tags in both files (or file groups). And I'm pretty sure I already saw "Russian" as a tag, even though our Russian friends tend to occupy the French tag in T3 FMs.
You're right, Spanish does work. I tried batch replacing every occurrence of "lang_italian" with "lang_spanish", and changing SupportedLangs and Language in Default.ini. The game still started up in Italian. Doing the same with Russian didn't work though.