Does anyone here have any tips re easy ways of converting CafeTran TXT glossaries with synonyms into TMXs?
I tried the awk script mentioned here: http://cafetran.wikidot.com/using-source-side-synonyms (which I found thanks to Masata's post here: https://cafetran.freshdesk.com/support/discussions/topics/6000013801), but am getting an error.
I don't know how to run the Perl script.
Try installing http://strawberryperl.com
I of course applaud what you're trying to achieve. Since most scripts/apps that are useful for text manipulation seem to have been developed for UNIX - Perl, sed, awk, grep - I thought I'd point you to StrawberryPerl. All of those scripts can run under Windows, but not straight from the command line (I think). If it's not working, I'll be happy to try running the scripts, but as you will understand, I don't have any tab dels with synonyms, so you'll have to send me an example that reflects your structure of those files.
Thanks for the Perl link! I used to have Perl running using some other Windows thingee, something like Active Perl, or ‘Active’ something or other, can't remember. Anyway, got side-tracked (as usual), and am working on something else at the moment ...
However, I actually managed to solve it differently: I did it in Ron's Editor, which has a handy Column > Split function, so I split the source and target columns into separate columns (using the semicolons), and then, via a little copy-pasting, created a tab-delimited file with all the synonyms one below the other. Then I converted this to a TMX in Heartsome's TMX editor via Convert to TMX.
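For anyone who would rather script those manual steps, here is a rough Python sketch of the same idea. It is not the awk or Perl script from the wiki, and it rests on several assumptions: the glossary is a tab-delimited file with the source in the first column and the target in the second, synonyms on either side are separated by semicolons, and every source synonym should be paired with every target synonym. The function names and the language codes (`en`, `nl`) are placeholders made up for illustration.

```python
import itertools
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def expand_synonyms(line, sep=";"):
    """Turn one 'src1;src2<TAB>tgt1;tgt2' line into all (source, target) pairs."""
    columns = line.rstrip("\r\n").split("\t")
    if len(columns) < 2:
        return []  # skip lines that have no target column
    sources = [s.strip() for s in columns[0].split(sep) if s.strip()]
    targets = [t.strip() for t in columns[1].split(sep) if t.strip()]
    # Assumption: every source synonym is paired with every target synonym.
    return list(itertools.product(sources, targets))


def glossary_to_tmx(lines, src_lang="en", tgt_lang="nl"):
    """Build a minimal TMX 1.4 document (as a string) from glossary lines."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "glossary2tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "phrase",
        "o-tmf": "tab-delimited",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for line in lines:
        for source, target in expand_synonyms(line):
            # One translation unit per expanded (source, target) pair.
            tu = ET.SubElement(body, "tu")
            for lang, text in ((src_lang, source), (tgt_lang, target)):
                tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
                ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    sample = ["car;automobile\tauto;wagen", "house\thuis"]
    print(glossary_to_tmx(sample))
```

To use it on a real file, you would read the glossary with `open(path, encoding="utf-8")`, pass the file object to `glossary_to_tmx`, and write the returned string to a `.tmx` file. Whether pairing every synonym with every other one is what you want depends on the glossary; if the synonyms are only meant to match on the source side, a different pairing rule would be needed.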
However, I think it would be really useful if there was an automatic (i.e. quick & easy) converter for this (Convert Glossary <-> Memory for terms), built into CT. I know how much you love features, so I requested it from Igor.
MB: I actually managed to solve it differently
MB: ... automatic (i.e. quick & easy) converter for this (Convert Glossary <-> Memory for terms), built into CT...
But, but... it's already there. Just import/export them files. Better than using TMXEditor actually, considering your love for BIG things (TMXEditor can't handle large TMX files without splitting them first).
Yeah, I forgot that CT can of course also convert between glossaries and memories for terms (e.g. via Memory > Import ...). However, that's not what I meant: I meant a built-in converter in CT that also respects synonyms.
MB: I meant a built-in converter in CT that also respects synonyms.
Hans CafeTran Wiki: Why are you using a TM for terms, Michael?
I can think of a few reasons:
I was experimenting with importing a project specific TXT glossary into my project TMX, to see what the difference would be like in practice.
Hans (words) = black
Michael B = green
I can think of a few reasons:
• He needs to share the terms with a colleague (if I wanted to share terms I would definitely not send someone a TMX; I would probably send an Excel file; most translators don't even have a TMX editor installed or know what it is)
• He needs to use them in another CAT tool (most other CAT tools prefer to import terminology from an Excel file or a delimited text file (usually .txt or .csv))
• He wants to use them in Recall (nope)
• He wants to see if he can benefit from them in Slate Desktop (I don't yet know what format Slate's dictionaries will be in, or if it will even have the ability to use dictionaries/glossaries, but I doubt it will be in TMX format; however, who knows, it might)
• He finally came to the conclusion that I am right (which is of course actually the prime reason) (definitely nope)
Speaking more generally though, I think that having a converter that can convert between terminology containers and translation memory containers, and one that can respect synonyms, would be a valuable asset to CT. The ability to convert a glossary containing synonyms into a TMX (or vice versa) is actually something that other users of other CAT tools may also find useful. They would of course have to fiddle around a bit to get their format into the CafeTran format, or vice versa (e.g., memoQ separates its synonyms differently in its delimited CSV terminology containers, but it isn't all too difficult to convert that into the CafeTran glossary format), but I can see how this could be a selling point. Currently, there don't seem to be any good converters that can do this.
I was thinking more in "terms" of getting rid of those synonyms and regular expressions, and then turning them into TMX for fuzziness and more control. For exchange purposes, a one-click conversion is already available.
MB: but i doubt it will be in TMX format
I don't think the dictionaries are, but I do think you can import TMX files (that will then be converted, like in Total Recall). We'll see. You'll see. I won't.
As far as I can see, a "package" for Windows already exists, not supported by the Moses project, but I don't think Tom's Slate Desktop is either.
@Hans CafeTran Wiki:
It is not a ‘one-man, one-time experiment’ that leads me to believe that this feature would be useful, but the fact that it is something general, and very important.
CT has two formats to store terminology in. It would therefore not be so strange if CT also had a way to convert back and forth between these two. This seems pretty reasonable to me, and something that users (new and old) might actually expect and find useful.
A possible use case: after listening to the older users argue endlessly about the old Glossaries vs Memories for Terms issue, a new user decides he/she would like to try one of them out, but can't figure out how to convert their current choice of terminology format into the other format. They look around in the menus for a Converter, but cannot find one.
This is not some crazy feature that no one will ever use or understand, but basic (missing) functionality.