
Adjusting the settings when using multiple TMs

I'm musing over an article about using several translation memories, with different settings (e.g. TM1 read-only, TM2 read & write).


Is it really true that TM configurations like that can only be set via the Memory menu and not via the Dashboard?


Is that because the Dashboard is "one size fits all" (any change to the settings is applied to all ticked TMs)?


Your digital or analogue input is highly appreciated.


Wiki the Wicking 




I thought that Igor recently changed it, so that you can now set different settings for each of your translation memories from the Dashboard. Or am I wrong?


Michael

It's possible all right (with the reasonable exception of changing a TMX file from Read-Only to Read & Write while it is open).


For an approach, see TMX Files, an Approach

Also see CafeTran Espresso 2015's Resources, an Overview


H.

Interesting approach. A shame though that you try so hard to discredit glossaries. By the way, there is no such thing as a "Termbase" vs "Glossary" distinction in CT. There are only TMs and Glossaries. But you know that of course. Hammering away re those "Termbases" of yours will only serve to confuse people.


Anyway, TMXs and Glossaries each have their own strengths, in terms of using them to store terms and phrases. Neither is perfect.


Ideally, Igor would choose one of them, and implement ALL of the features of both in this one format. It would also be cool if there was then an automatic converter in CT for converting between the two (the new, unified terminology format and the one that was discarded).


How would this work? Imagine that Igor chose to drop TMXs as a place to store terms/phrases in, and gave txt glossaries all the magic of the TMXs. I'm thinking primarily of fuzziness here, as I don't think TMXs have anything else important that glossaries can't do.


So we would then have just TMXs for segments and TXT Glossaries for terms/phrases. A lot clearer to new users. These glossaries would also have fuzziness, available with an On/Off switch, and all the features they already have. Sounds great to me.


A lot of that stuff you love so much regarding the juggling of multiple TMXs with different settings could also be achieved with one or more txt glossaries: just use the fields (Subject/Client/Context) and their respective priority levels. If Igor chose one format for terminology, he could also focus on improving and tweaking one format, rather than having to work on two different, yet related, formats simultaneously.


Anyway, just some thoughts. Do with them what you will.


 

At the moment, I cannot go into this in great detail. I think I don't really need either, because it's already there.


MB: Imagine that Igor chose to drop TMXs as a place to store terms/phrases in, and gave txt glossaries all the magic of the TMXs.


If Igor wants to do that (and I hope he won't), he'll have to add structure to the txt glossaries. The only way to achieve that that's commonly used and accepted is to use a mark-up language. Et voilà, a TMX file is a mark-up file. Rather than add structure to TXT files, add "features" to the TMX files. Those features can be implemented almost effortlessly. I'm not in favour of this, I think it's a good thing to keep them separated. I don't want to use regexes in a TMX file, and if I want to use a regex (or synonyms, or whatever), I can use the txt file format for it. Not that I do use them. There are by now 6 pages of hits for "regex" (pages, mind you), and I don't think even one is useful for me. I also think that none of the active users understands them, and worse, I'm afraid I'm the one who comes the closest to understanding them.


MB: Hammering away re those "Termbases" of yours will only serve to confuse people.


In fact, it's you who generated the confusion. TMX and txt files for terms used to be separated, and for a good reason (again, no need to mention it again). At some stage, Igor wanted to abolish the txt glossaries altogether. And for a good reason. There are lots of good reasons to be against the use of txt files for termbases.


But I admit I lost. Things are not going the way I want them to. As long as I can still use CT without too many crashes and other misery, I'll keep using it. And thank God and Tim for Time Machine, so I can revert to an older version.


H.

HANS: If Igor wants to do that (and I hope he won't), he'll have to add structure to the txt glossaries. The only way to achieve that that's commonly used and accepted is to use a mark-up language. Et voilà, a TMX file is a mark-up file.


MB: You lose me every time you start talking about the fact that CT glossaries "have no structure" [of course they do!], and that nonsense about markup languages. TMXs and TXT glossaries both have structure; they are just different. Wordfast Classic uses tab-dels for both its TMs and glossaries btw. In a TMX file, a unit of X is placed inside tags. In a TXT glossary, a unit of X is placed in a column/field. They are both just different ways of achieving the same thing: sticking information in the file. The fact that TMXs provide fuzzy matching has nothing to do with the structure of the TMX file, but with how Igor has mapped certain elements in said TMX file to routines and processes inside CT. He could just as easily have mapped certain elements in the TXT glossaries to said routines and processes inside CT.
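
To make the point about structure concrete, here is a minimal Python sketch. It is not CafeTran's code (which is not shown anywhere in this thread); the function names, the sample data and the 0.7 threshold are all assumptions made for illustration. It reads the same term pair from a tiny TMX string and from a tab-delimited string, and then runs one and the same fuzzy lookup over both, using the standard library's `difflib` as a stand-in for whatever matching the tool really does.

```python
import csv
import io
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

# ElementTree expands the predefined xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, stripped-down samples holding the same pair.
TMX_SAMPLE = """<tmx version="1.4"><header/><body>
<tu>
<tuv xml:lang="nl-NL"><seg>Vloerkanaal van het conferentiesysteem.</seg></tuv>
<tuv xml:lang="en-GB"><seg>Floor channel of the conferencing system.</seg></tuv>
</tu>
</body></tmx>"""

TXT_SAMPLE = (
    "#nl-NL\t#en-GB\n"
    "Vloerkanaal van het conferentiesysteem.\tFloor channel of the conferencing system.\n"
)


def pairs_from_tmx(text, src="nl-NL", tgt="en-GB"):
    """Return (source, target) pairs from TMX markup."""
    pairs = []
    for tu in ET.fromstring(text).iter("tu"):
        segs = {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iter("tuv")}
        if src in segs and tgt in segs:
            pairs.append((segs[src], segs[tgt]))
    return pairs


def pairs_from_txt(text):
    """Return (source, target) pairs from a tab-delimited glossary."""
    rows = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    next(rows)  # skip the header row
    return [(row[0], row[1]) for row in rows if len(row) >= 2]


def fuzzy_lookup(query, pairs, threshold=0.7):
    """Score every source entry against the query; the file format plays no role here."""
    scored = [
        (SequenceMatcher(None, query.lower(), source.lower()).ratio(), source, target)
        for source, target in pairs
    ]
    return sorted((hit for hit in scored if hit[0] >= threshold), reverse=True)


if __name__ == "__main__":
    query = "Vloerkanaal van het conferentiesysteem"  # no final full stop: a fuzzy hit
    for name, pairs in (("TMX", pairs_from_tmx(TMX_SAMPLE)), ("TXT", pairs_from_txt(TXT_SAMPLE))):
        print(name, fuzzy_lookup(query, pairs))
```

Once both files are loaded, the lookup only ever sees a list of pairs, so the fuzziness lives in the lookup routine and not in the file format.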


I also think that tab-dels might scale better: take a look at a random selection of identical data (for example a few term pairs and some metadata) stored in (1) a TMX file and (2) a tab-del, and then see how many characters the two files have, or check their size on your disk. I am willing to bet that tab-dels would win in this respect, which might actually make them faster the larger the files become.


Compare this identical entry, e.g.:


TMX

<tu tuid="238" creationdate="20151012T001036Z" creationid="MICHAEL BEIJER">
  <prop type="Project">ANB-257974 (patent 1)</prop>
  <prop type="Subject">PATENTS</prop>
  <tuv xml:lang="nl-NL">
    <seg>Vloerkanaal van het conferentiesysteem.</seg>
  </tuv>
  <tuv xml:lang="en-GB">
    <seg>Floor channel of the conferencing system.</seg>
  </tuv>
</tu>


TXT:


#nl-NL [TAB] #en-GB [TAB] #Context [TAB] #Subject [TAB] #Client [TAB] #Note [TAB] #Sense [TAB] #Usage example [TAB] #Source [TAB] #URL 

Vloerkanaal van het conferentiesysteem. [TAB] Floor channel of the conferencing system. [TAB] [TAB] PATENTS


Of course, the TXT entry is missing a few things, like a creationdate and a creationid, and the info in the header of the TMX (<prop type="x-segments">true</prop><prop type="x-terms">false</prop><prop type="x-processing_tags">true</prop><prop type="x-read_only">false</prop><prop type="x-pretranslate_only">false</prop><prop type="x-terms_consistency_check">false</prop><prop type="x-priority">2</prop><prop type="x-integration">0</prop><prop type="x-case_match">true</prop><prop type="x-duplicates">1</prop><note>size=1238</note>), but these can easily be added as a few more columns in the TXT file.


See what I mean? It's all just text.
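
For anyone who wants to check the size claim rather than take it on faith, a rough Python sketch follows. It stores the entry above once as TMX markup and once as a single tab-delimited row, and compares the UTF-8 byte counts. The extra columns for project, creation date, creation id and tuid are a hypothetical layout added so that both versions carry the same data; they are not columns CafeTran is known to use.

```python
# The entry from this post as TMX markup.
TMX_ENTRY = """<tu tuid="238" creationdate="20151012T001036Z" creationid="MICHAEL BEIJER">
<prop type="Project">ANB-257974 (patent 1)</prop>
<prop type="Subject">PATENTS</prop>
<tuv xml:lang="nl-NL">
<seg>Vloerkanaal van het conferentiesysteem.</seg>
</tuv>
<tuv xml:lang="en-GB">
<seg>Floor channel of the conferencing system.</seg>
</tuv>
</tu>
"""

# The same data as one tab-delimited row (hypothetical column order).
TXT_ENTRY = "\t".join([
    "Vloerkanaal van het conferentiesysteem.",
    "Floor channel of the conferencing system.",
    "PATENTS",                # subject
    "ANB-257974 (patent 1)",  # project
    "20151012T001036Z",       # creation date
    "MICHAEL BEIJER",         # creation id
    "238",                    # tuid
]) + "\n"


def size_in_bytes(text):
    """Size of the text when saved as UTF-8."""
    return len(text.encode("utf-8"))


tmx_size = size_in_bytes(TMX_ENTRY)
txt_size = size_in_bytes(TXT_ENTRY)

if __name__ == "__main__":
    print(f"TMX entry: {tmx_size} bytes")
    print(f"TXT entry: {txt_size} bytes")
    print(f"TXT is {txt_size / tmx_size:.0%} of the TMX size")
```

This only measures size on disk. Whether a smaller file also means faster lookups depends on how the tool indexes the data once it is loaded, which this sketch does not test.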


#################################################################################


HANS: Rather than add structure to TXT files, add "features" to the TMX files. Those features can be implemented almost effortlessly. I'm not in favour of this, I think it's a good thing to keep them separated. I don't want to use regexes in a TMX file, and if I want to use a regex (or synonyms, or whatever), I can use the txt file format for it. Not that I do use them. […]


MB: I have often wondered why no one tried to implement synonyms in TMXs. However, it would probably have to come from the guys who develop the TMX standard(s) [which means it would probably take 19 years and still never get finished], rather than from Igor, because he would then end up with a non-standard TMX.


MB: Hammering away re those "Termbases" of yours will only serve to confuse people.


HANS: In fact, it's you who generated the confusion. TMX and txt files for terms used to be separated, and for a good reason (again, no need to mention it again). At some stage, Igor wanted to abolish the txt glossaries altogether. And for a good reason. There are lots of good reasons to be against the use of txt files for termbases.


MB: And none of them are valid.


Michael

OK. I give up. I gave up loooong ago.


H.

Hi guys,


I really see no point in arguing about TXT vs TMX formats.


1. Both formats are hugely popular now: TMX as a TM exchange format, TXT as a simple user-, editor- and Excel-friendly format.

2. You can store fragments in TMX. Nothing has changed in this regard. Nobody is losing anything.

3. TXT glossaries with the OPTIONAL regular expressions extension are meant for the exact matching of terms in auto-assembling.

4. Extending the TMX standard in CT is not a perfect idea since such TMs could be exchanged only with other CT users.

5. Extending the TXT format is easy and makes it possible to introduce new concepts (e.g. source and target side synonyms, regular expressions).
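
As an illustration of points 3 and 5, here is a small Python sketch of exact term matching driven by a tab-delimited glossary whose source column may hold synonyms or a regular expression. The semicolon as synonym separator, the sample entries and the function names are assumptions made for this example; they are not CafeTran's documented glossary syntax.

```python
import re

# Hypothetical glossary: source side may hold ;-separated synonyms or a regex.
GLOSSARY_TXT = (
    "#nl-NL\t#en-GB\n"
    "vloerkanaal;vloerkanalen\tfloor channel\n"
    "conferentie-?systeem\tconferencing system\n"
)


def load_glossary(text):
    """Compile each source entry into a whole-word, case-insensitive pattern."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        source, target = line.split("\t")[:2]
        alternatives = "|".join(f"(?:{alt})" for alt in source.split(";"))
        pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
        entries.append((pattern, target))
    return entries


def find_terms(segment, entries):
    """Return (matched source text, target term) for every exact hit in the segment."""
    return [
        (match.group(0), target)
        for pattern, target in entries
        for match in pattern.finditer(segment)
    ]


if __name__ == "__main__":
    entries = load_glossary(GLOSSARY_TXT)
    print(find_terms("Vloerkanaal van het conferentiesysteem.", entries))
```

The word boundaries keep the matching exact: a longer word that merely starts with a glossary term is not reported as a hit.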


Igor
