
Matching tags is unreliable

I have been struggling with tags in CafeTran for a long time. 

As far as I understand, CafeTran TMs do not store the tags themselves, only their positions.

Here is an extract from the Wiki:

Other CAT tools save inline formatting in the TUs of a TMX file. CafeTran only stores the position of formatting, in a property. Both approaches have pros and cons.

Here is an SDL TMX file:


And this is how CafeTran stores the positions of the character changes:


When you save a third-party TMX file in CafeTran, inline tags will be removed but their positions will be mapped to the TU's properties so that Exact Matches are possible.

(The Wiki entry is about TMX, in case the link stops working later.)
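
To make the position-based idea from the Wiki extract concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the `<b>`-style tags, the function names `strip_tags` and `restore_tags`, and the list of `(offset, tag)` pairs are all hypothetical, and this is not CafeTran's actual property format.

```python
import re

# Hypothetical, simplified inline tags such as <b> or </b>.
TAG = re.compile(r"</?[A-Za-z][^>]*>")

def strip_tags(segment):
    """Return (plain_text, tags); tags is a list of (offset, tag) pairs.

    Each offset is an absolute character position in the plain text.
    """
    plain_parts, tags = [], []
    plain_len, last = 0, 0
    for m in TAG.finditer(segment):
        chunk = segment[last:m.start()]
        plain_parts.append(chunk)
        plain_len += len(chunk)
        tags.append((plain_len, m.group()))
        last = m.end()
    plain_parts.append(segment[last:])
    return "".join(plain_parts), tags

def restore_tags(plain, tags):
    """Re-insert tags at their stored absolute offsets."""
    out, last = [], 0
    for offset, tag in tags:
        out.append(plain[last:offset])
        out.append(tag)
        last = offset
    out.append(plain[last:])
    return "".join(out)

plain, tags = strip_tags("Press <b>Start</b> now")
print(plain)                      # Press Start now
print(tags)                       # [(6, '<b>'), (11, '</b>')]
print(restore_tags(plain, tags))  # Press <b>Start</b> now
```

As long as the text is identical, the round trip is lossless, which is presumably why exact matches are supposed to work with this approach.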

As it says, both approaches have pros and cons. I wonder what the pros of this approach are.

More importantly, such tags are not inserted reliably. I work only with external projects from memoQ, and whenever I insert an exact match from a previous file, or simply delete a translation and insert it again with auto-assembly or by clicking on the Matchboard, the tags are almost never inserted correctly. Moreover, clicking on the Matchboard and auto-assembly produce different results. Some tags are misplaced, some are repeated, and some are missing.

These are all red tags (I'm not talking about purple tags here). I have tag processing enabled, and I also import segments from the project into a native CafeTran memory. But shouldn't it work with any memory? The TMX memories I import from memoQ into CT are not marked as being different in any way.

This is a major issue: it prevents exact matching and makes it impossible to reuse previous translations without manual work. For example, if I have a file with several thousand segments and lots of tags, and only some of the segments are new, I still need to manually place all the tags in all the segments I have already translated, confirmed and checked. Frankly, I do not understand how to get such a basic feature as exact matching to work in CafeTran.

How are other users dealing with this?

Can this issue be addressed?

It does remember them, but as absolute positions, not relative ones, so fuzzy matches (FMs) get mistagged.
The design decision to leave out basic markup was made before the system for adding it in CafeTran was introduced. It would be good if the tags were now stored in the memories on the fly and inserted during conversion of memoQ memories.
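
A small Python sketch of why absolute positions break on fuzzy matches. Everything here is hypothetical: the stored offsets, the function names `apply_absolute` and `apply_anchored`, and in particular my reading of "relative" as "anchored to the text the tags enclose". It is a toy model, not a description of CafeTran's internals.

```python
def apply_absolute(plain, tags):
    """Insert tags at fixed character offsets, whatever the text is."""
    out, last = [], 0
    for offset, tag in tags:
        out.append(plain[last:offset])
        out.append(tag)
        last = offset
    out.append(plain[last:])
    return "".join(out)

def apply_anchored(plain, spans):
    """Wrap the first occurrence of each enclosed text in its tag pair.

    spans is a list of (open_tag, enclosed_text, close_tag) triples.
    Spans whose text is not found are skipped.
    """
    for open_tag, enclosed, close_tag in spans:
        idx = plain.find(enclosed)
        if idx == -1:
            continue
        end = idx + len(enclosed)
        plain = plain[:idx] + open_tag + enclosed + close_tag + plain[end:]
    return plain

# Offsets stored for the TM segment "Press Start now".
stored = [(6, "<b>"), (11, "</b>")]

print(apply_absolute("Press Start now", stored))
# Press <b>Start</b> now          (exact match: fine)

print(apply_absolute("Please press Start now", stored))
# Please<b> pres</b>s Start now   (fuzzy match: tags land mid-word)

print(apply_anchored("Please press Start now", [("<b>", "Start", "</b>")]))
# Please press <b>Start</b> now
```

The anchored variant has its own weaknesses (repeated words, text that changed inside the tags), so it is shown only to contrast with fixed offsets.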

Thank you for this research and for clarifications!




Extended Word document

When I look at how the formatting is stored in the TMX, I wonder whether the TMX specification would allow simple markup for CafeTran Espresso, such as:

<b>...</b> etc.?

Does it really have to be as complicated as in the memoQ, Studio and Transit TMX files?
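
For comparison, as far as I understand the TMX 1.4 specification, native formatting codes inside a `<seg>` are wrapped in inline elements such as `<bpt>` (begin paired tag) and `<ept>` (end paired tag), with the original code escaped as element content. The following Python sketch builds such a segment with the standard library. The sample sentence and the choice of `<b>` as the native code are my own assumptions; real exports from memoQ, Studio or Transit carry more attributes and tool-specific content.

```python
import xml.etree.ElementTree as ET

# Build: Press <b>Start</b> now, with the bold codes wrapped TMX-style.
seg = ET.Element("seg")
seg.text = "Press "

bpt = ET.SubElement(seg, "bpt", {"i": "1"})  # opening code of pair 1
bpt.text = "<b>"      # escaped automatically on serialization
bpt.tail = "Start"    # translatable text after the opening code

ept = ET.SubElement(seg, "ept", {"i": "1"})  # closing code of pair 1
ept.text = "</b>"
ept.tail = " now"

xml = ET.tostring(seg, encoding="unicode")
print(xml)
# <seg>Press <bpt i="1">&lt;b&gt;</bpt>Start<ept i="1">&lt;/b&gt;</ept> now</seg>

# The plain text is still recoverable by skipping the inline elements.
plain = (seg.text or "") + "".join(child.tail or "" for child in seg)
print(plain)  # Press Start now
```

So even the "simple" case needs the wrapper elements, but it is considerably shorter than what the other tools write.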


Looking back at all my experience with CafeTran so far, I can say that unreliable matching of tags in full/partial matches is the biggest issue I've come across. The reason is that it prevents me from reusing previous matches in tag-intensive external (memoQ) projects.

This concerns not only memoQ tags, but also the "purple" tags obtained by replacing non-translatable fragments. They are matched better, but still not 100% correctly. If I replace a non-translatable such as a color code with a tag, I do it to save typing time and to avoid possible syntax mistakes. However, as the tags are not matched reliably, I need to insert them manually or double-check whether the software did it right, and thus lose the time I saved before.

Improving this would save a lot of time. Hence I'd like to ask if this could be addressed.


Here's another example. When I insert the MT suggestion via the keyboard shortcut, the tag is placed somewhere in the middle of a word:


This is how it should have been done:


It happens to me too, not with MT, but with fuzzy matches. The tag is inserted inside a word, AFAIR between the stem and the inflectional ending.
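
One conceivable mitigation for tags landing inside a word is to move a computed tag offset to the nearest word boundary before inserting the tag. The Python sketch below is purely hypothetical (the function `snap_to_word_boundary` is my own illustration, not something CafeTran is known to do), and it treats any run of alphanumeric characters as a word, which is a simplification.

```python
def snap_to_word_boundary(text, offset):
    """Return the position nearest to offset that is not inside a word.

    A position is "inside a word" when the characters on both sides of it
    are alphanumeric. Ties are resolved towards the left boundary.
    """
    def inside_word(i):
        return 0 < i < len(text) and text[i - 1].isalnum() and text[i].isalnum()

    if not inside_word(offset):
        return offset
    left = offset
    while inside_word(left):
        left -= 1
    right = offset
    while inside_word(right):
        right += 1
    return left if offset - left <= right - offset else right

text = "Please press Start now"
print(snap_to_word_boundary(text, 11))  # 12 (end of "press", not inside it)
print(snap_to_word_boundary(text, 6))   # 6  (already at a boundary)
```

This would only keep tags out of the middle of words; it cannot tell which word a tag actually belongs to.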
