
Matching tags is unreliable

I have been struggling with tags in CafeTran for a long time. 

As far as I understand, tags are not stored in CafeTran TMs, but rather their positions are remembered. 

Here is an extract from the Wiki:

Other CAT tools save inline formatting in the TUs of a TMX file. CafeTran only stores the position of formatting, in a property. Both approaches have pros and cons.

Here is an SDL TMX file:


And this is how CafeTran stores the positions of the character changes:


When you save a third-party TMX file in CafeTran, inline tags will be removed but their positions will be mapped to the TU's properties so that Exact Matches are possible.
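To make the idea in the extract concrete, here is a minimal Python sketch of what "removing inline tags but remembering their positions" could look like. It is only an illustration: the `<b>`-style markup, the function names `strip_tags` and `restore_tags`, and the list-of-offsets layout are all assumptions made for this example, not CafeTran's actual property format.

```python
import re

# Hypothetical inline markup for this sketch: simple tags such as <b>, </b>, <i>.
TAG = re.compile(r"</?[a-z]+\d*>")

def strip_tags(segment):
    """Return (plain_text, positions), where each position is
    (offset_in_plain_text, tag)."""
    plain_parts = []
    positions = []
    offset = 0  # length of plain text emitted so far
    last = 0    # index in `segment` just after the previous tag
    for m in TAG.finditer(segment):
        chunk = segment[last:m.start()]
        plain_parts.append(chunk)
        offset += len(chunk)
        positions.append((offset, m.group()))
        last = m.end()
    plain_parts.append(segment[last:])
    return "".join(plain_parts), positions

def restore_tags(plain, positions):
    """Re-insert tags at their recorded offsets (positions must be in order)."""
    out = []
    last = 0
    for offset, tag in positions:
        out.append(plain[last:offset])
        out.append(tag)
        last = offset
    out.append(plain[last:])
    return "".join(out)

plain, positions = strip_tags("Press <b>Save</b> now.")
print(plain)      # Press Save now.
print(positions)  # [(6, '<b>'), (10, '</b>')]
print(restore_tags(plain, positions))  # Press <b>Save</b> now.
```

As long as the text is identical, the round trip is lossless, which is why exact matches can in principle get their tags back from positions alone.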

(The entry is about TMX, in case the link does not work some time later)

As it says, both approaches have pros and cons. I wonder what the pros of this approach are.

More importantly, such tags are not inserted reliably. I work only with external projects from memoQ, and whenever I insert an exact match from a previous file, or simply delete a translation and insert it again with auto-assembly or by clicking on the Matchboard, the tags are almost never inserted correctly. Moreover, clicking on the Matchboard and auto-assembly produce different results. Some tags are misplaced, some are duplicated, some are missing.

These are all red tags (I'm not talking about purple tags here). I have processing of tags enabled, and I also import segments from the project into a native CafeTran memory. But shouldn't it work with any memory? The TMX memories I import from memoQ into CT are not marked as being different in any way.

This is a major issue that prevents exact matching and does not allow reusing previous translations without manual work. For example, if I have a file with several thousand segments and lots of tags, and only some of the segments are new, I would still need to manually place all the tags in all the segments I have already translated, confirmed and checked. Frankly, I do not understand how to get such a basic feature as exact matching out of CafeTran.

I wonder how other users are dealing with this? 

Can this issue be addressed?

I am not frustrated ™ 

Another frustrated user here about tag placement with external projects. I've set CTE to show boundary tags in the target, but even with 100% matches they are not inserted correctly. Result: too many clicks even with 100% matches!


It happens to me too, not with MT, but with fuzzy matches. The tag is inserted inside a word, AFAIR between the stem and the inflection.

Here's another example. When I insert the MT suggestion via the keyboard shortcut, the tag is placed somewhere in the middle of a word:


This is how it should have been done:


Looking back at all my experience with CafeTran so far, I can say that unreliable tag placement in full and partial matches is the biggest issue I've come across. The reason is that it does not allow me to reuse previous matches in tag-intensive external (memoQ) projects.

But this concerns not only memoQ tags, but also the "purple" tags obtained from replacing non-translatable fragments. They are matched better, but still not 100% correctly. If I replace a non-translatable such as a color code with a tag, I do it to save typing time and to avoid possible syntax mistakes. However, as the tags are not matched reliably, I need to insert them manually or double-check whether the software placed them correctly, and so I lose the time I saved before.

Improving this would save a lot of time. Hence I'd like to ask if this could be addressed.




Extended Word document

When I look at how the formatting is stored in the TMX, I wonder whether the TMX specification would allow simple markup for CafeTran Espresso, such as:

<b>...</b> etc.?

Does it really have to be as complicated as in the memoQ, Studio and Transit TMX files?


Thank you for this research and for clarifications!

The design decision to leave basic markup out of the memories was made before the system for adding it in CafeTran was introduced. It would be good if the markup were now stored in the memories on the fly and also inserted when memoQ memories are converted.
It does remember them, but as absolute positions, not relative ones. So fuzzy matches (FMs) get mistagged.
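The absolute-versus-relative point can be illustrated with a small Python sketch. Everything here is hypothetical (the example sentences, the helper names `insert_at` and `reanchor`, the offsets); it only shows why fixed character offsets work for an exact match and land inside a word as soon as the text changes, and it is not a description of how CafeTran actually places tags.

```python
def insert_at(text, positions):
    """Insert tags at absolute character offsets (positions must be in order)."""
    out = []
    last = 0
    for offset, tag in positions:
        out.append(text[last:offset])
        out.append(tag)
        last = offset
    out.append(text[last:])
    return "".join(out)

def reanchor(old_text, positions, new_text):
    """Relocate one tag pair by searching for the text it wrapped.
    Returns None when the wrapped text cannot be found."""
    (start, open_tag), (end, close_tag) = positions
    wrapped = old_text[start:end]
    idx = new_text.find(wrapped)
    if idx == -1:
        return None  # no safe placement; leave it to the translator
    return [(idx, open_tag), (idx + len(wrapped), close_tag)]

stored = "Click Save to continue."           # translation stored in the memory
positions = [(6, "<b>"), (10, "</b>")]       # offsets recorded for that text
fuzzy = "Please click Save to continue."     # slightly different new text

print(insert_at(stored, positions))  # Click <b>Save</b> to continue.
print(insert_at(fuzzy, positions))   # Please<b> cli</b>ck Save to continue.
print(insert_at(fuzzy, reanchor(stored, positions, fuzzy)))
# Please click <b>Save</b> to continue.
```

Anchoring a tag pair to the text it wraps is only one possible "relative" strategy, and it fails when the wrapped text itself changes; in that case the sketch returns `None` rather than guessing.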

Does it mean CafeTran does not remember the positions of memoQ tags?

The files ...


I created a memoQ project with bold, italics and a line break.

The memoQ TMX looks like this:


I exported to MQXLIFF and opened in CafeTran Espresso. I used the memoQ TMX to insert all EMs. All tags were inserted correctly.

This is how CafeTran Espresso modified the memoQ TMX file:


I then removed the targets from the project, translated the segments anew and inserted CafeTran Espresso style markup for bold and italics:


Next segment:


And then comes the third segment. I expected the CafeTran Espresso style markup to have been saved in the memory, so that it would also wrap the comma that made segment number 3 a fuzzy match:


This didn't happen. When I looked at the CafeTran Espresso memory I saw that the markup isn't stored!!!


I think it would be a great improvement if this markup were stored, and if memoQ TMX files were converted to this markup on import.

Thank you, Jean, this is how I do it now: Type tag number + Esc key = Just type a tag number followed by the Escape key to transfer the corresponding tag.

It would be great indeed to have memoQ tags processed by CafeTran. Hoping for improvements in this area!

CafeTran tries to guess the correct tag placement, but this feature is not that efficient… I can only agree, and I disabled it here ages ago.

Disabling it is a trade-off, since no tags are inserted if they are not in the TM, but I've found it takes more time to fix incorrectly placed tags than to enter them correctly once.

There are many ways to easily add tags, and tag pairs.

Here is a reminder:

- Type tag number + Esc key = Just type a tag number followed by the Escape key to transfer the corresponding tag.

- Mouse tag placement = When enabled, tags can be added to the target segment simply by left-clicking where you’d like to place a tag. If you select a word or a word string, it is enclosed by two tags. This can be toggled via the Ctrl+Enter shortcut or via the <> button found in the target segment editor.

- List tags = Display a pop-up list of all tags in the source segment. Select and hit enter or click the tag you wish to insert. List tags also enables you to see the exact type of a tag. See the tool tip over each tag in the list. Default shortcut: F3.

- Insert next tag = Transfer the current tag in the list of tags from the source segment to the target segment via the corresponding shortcut or menu action.

- Insert source which includes tags = When you insert the complete source segment or a source text selection (Alt+I) which includes tags, these are also transferred to the target segment.

If you do decide to keep "Transfer tags to matches", you can still choose to remove all tags in the target segment via the Edit > Target segment > Remove tags action.

For tags coming from external projects, it would be great if CafeTran could match internal tags (tags from internal TMX files, I mean) with external project tags, at least for some external projects such as the memoQ ones you mention, as I think it already does so for SDL Trados files.

Additionally, in external projects, for segments that start or end with a tag pair (like bold or a link), I think CafeTran only shows one tag (not the starting or ending tag), which makes it difficult to move this tag pair around, as is often necessary depending on the language. This is also a major issue and an area for improvement.

Hello Jean,

Indeed, after observing the behavior with different auto-assembling settings, I have completely disabled all of them, except "Transfer tags to matches", as I've been thinking it is exactly the feature that does what I want — inserts tags where they belong. 

I have just disabled "Transfer tags to matches", and now no tags are inserted at all.
