
Matching tags is unreliable

I have been struggling with tags in CafeTran for a long time. 


As far as I understand, tags are not stored in CafeTran TMs, but rather their positions are remembered. 


Here is an extract for the Wiki:


Other CAT tools save inline formatting in the TUs of a TMX file. CafeTran only stores the position of formatting, in a property. Both approaches have pros and cons.


Here is an SDL TMX file:

(image)


And this is how CafeTran stores the positions of the character changes:

(image)


When you save a third-party TMX file in CafeTran, inline tags will be removed but their positions will be mapped to the TU's properties so that Exact Matches are possible.


http://beijer.uk/cafetranhelp.com_ARCHIVED/cafetran.wikidot.com/tmx.html

(The entry is about TMX, in case the link does not work some time later)
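To make the idea in the Wiki extract concrete, here is a minimal Python sketch of stripping inline tags from a segment while recording their positions. It is only an illustration: the tag pattern, the function name `split_tags` and the shape of the returned list are my assumptions, not CafeTran's actual property format, which this thread does not show.

```python
import re

# Hypothetical inline-tag pattern: matches simple tags such as <b>, </b>, <i/>.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")


def split_tags(segment):
    """Return (plain_text, positions).

    positions is a list of (offset_in_plain_text, tag) pairs,
    kept in the order the tags appear in the segment.
    """
    plain_parts = []
    positions = []
    plain_length = 0
    last_end = 0
    for match in TAG_PATTERN.finditer(segment):
        # Text between the previous tag and this one is kept as-is.
        text_before = segment[last_end:match.start()]
        plain_parts.append(text_before)
        plain_length += len(text_before)
        # The tag itself is dropped from the text; only its offset is kept.
        positions.append((plain_length, match.group()))
        last_end = match.end()
    plain_parts.append(segment[last_end:])
    return "".join(plain_parts), positions


plain, positions = split_tags("Click <b>Save</b> now.")
print(plain)      # Click Save now.
print(positions)  # [(6, '<b>'), (10, '</b>')]
```

With this kind of storage the plain text can be compared directly against a new source segment, which is presumably why the extract says Exact Matches remain possible.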


As it says, both approaches have pros and cons. I wonder what the pros of this approach are.


More importantly, such tags are not inserted reliably. I work only with external projects from memoQ, and whenever I insert an exact match from a previous file, or simply delete a translation and insert it again with auto-assembly or by clicking on the Matchboard, the tags are almost never inserted correctly. Moreover, clicking on the Matchboard and using auto-assembly produce different results. Some tags are misplaced, some are duplicated, some are missing.
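For anyone who wants to spot the missing and duplicated tags outside the tool, here is a small Python sketch of such a check. The numbered `{1}` placeholder format and the function name `compare_tags` are assumptions made for illustration; real tags in CafeTran or memoQ files look different, so the pattern would need adapting.

```python
import re
from collections import Counter

# Hypothetical placeholder format: numbered tags written as {1}, {2}, ...
TAG = re.compile(r"\{\d+\}")


def compare_tags(source, target):
    """Return (missing, extra) tags of the target relative to the source."""
    source_counts = Counter(TAG.findall(source))
    target_counts = Counter(TAG.findall(target))
    # Counter subtraction keeps only positive differences.
    missing = sorted((source_counts - target_counts).elements())
    extra = sorted((target_counts - source_counts).elements())
    return missing, extra


missing, extra = compare_tags(
    "{1}Save{2} the {3}file",
    "{1}Speichern{1} die Datei{2}",
)
print(missing)  # ['{3}']
print(extra)    # ['{1}']
```

Note that this only counts tags. It cannot tell whether a tag that is present sits in the wrong place.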


These are all red tags (I'm not talking about purple tags here). I have processing of tags enabled, and I also import segments from the project into a native CafeTran memory. But shouldn't it work with any memory? The TMX memories I import from memoQ into CT are not marked as being different in any way.


This is a major issue that prevents exact matching and does not allow reusing previous translations without manual operations. For example, if I have a file with several thousand segments and lots of tags, and only some of the segments are new, I would still need to manually place all the tags in all the segments I have already translated, confirmed and checked. Frankly, I do not understand how to get such a basic feature as exact matching to work in CafeTran.


I wonder how other users are dealing with this? 


Can this issue be addressed?


3 shades of grey:


image


Never ever in your whole life? Not even on holidays?


To understand the idea in languages I do not know, like Chinese. That's all.


I believe using MT is detrimental to the writing style. I cannot bear its phrasing (I know how it looks, sure).


I'm quite sure you'll be confronted with the results of MT in texts around you, in your daily life.


That's one of the reasons why many texts are so creepy.

3 shades of grey:


I have a different approach: I switch off all colors in the source segment, so I cannot benefit from it :)

>I believe using MT is detrimental to the writing style.


When I started using MT, I was afraid of that too. And of the gigantic distraction.


Luckily, the opposite has become true.

Even that I can understand. It took me some time to get used to the colourful fancy fair.

>When I started using MT, I was afraid of that too.


It may be useful in some language combinations, not in English to Russian and Ukrainian. The word order is completely different.


I start from a blank page and dictate the first draft.

Dictation friendliness was a major factor for me to use CafeTran in the first place.


(and also the fact that other Mac tools are too bad)


In the process, I also found out that auto suggestion is really great.


Matching/tags/project structure — not so great. Plus there are some bugs too.


Could you please share which features make CafeTran so good for you?

I have just checked, and all the projects I do in CafeTran produce the same wrong results when inserting tags with high fuzzy matches. Much more manual work is required than should be necessary.


Example: among 9 tags, some red, some purple, only 1 purple tag is inserted, and even that one is in the wrong place (the space is inserted after the tag instead of before it).


If CafeTran struggles to guess the positions of tags, wouldn't it be better to save them in the memory? It would make life easier for everyone. A couple more GB of RAM wouldn't matter. That would be a really simple approach. I believe the current approach is an over-complication.
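A rough Python sketch of why position-only storage may struggle with fuzzy matches. This is not CafeTran's actual algorithm, which is not described in this thread; `apply_offsets` and the sample data are hypothetical and only illustrate the general problem of reusing character offsets on a changed text.

```python
def apply_offsets(plain_text, positions):
    """Insert each tag at its recorded character offset.

    positions is a list of (offset, tag) pairs sorted by offset.
    """
    result = []
    last = 0
    for offset, tag in positions:
        # Clamp so that a stale offset never runs past the end of the text.
        offset = min(offset, len(plain_text))
        result.append(plain_text[last:offset])
        result.append(tag)
        last = offset
    result.append(plain_text[last:])
    return "".join(result)


# Offsets recorded for the segment "Click Save now."
positions = [(6, "<b>"), (10, "</b>")]

print(apply_offsets("Click Save now.", positions))
# Click <b>Save</b> now.

print(apply_offsets("Please click Save now.", positions))
# Please<b> cli</b>ck Save now.
```

On the exact match the tags land correctly. On the slightly different segment the same offsets point at the wrong characters, so the tool has to guess, whereas tags stored inline would stay attached to the word they wrap.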

Hello Andrey,


If you don't use Auto-assembling in your language pair/workflow, you could try disabling all related settings in Preferences > Auto-assembling, as they tend to interfere when matches are transferred, including the transfer of tags.


There is a "Transfer tags to matches" option there. Is it any better with this setting disabled?


In my experience, I am better off without the Auto-assembling features in my line of work.

Hello Jean,


Indeed, after observing the behavior with different auto-assembling settings, I completely disabled all of them except "Transfer tags to matches", as I thought it was exactly the feature that does what I want: inserting tags where they belong.


I have just disabled "Transfer tags to matches", and now no tags are inserted at all.

CafeTran tries to guess the correct tag placement, but this feature is not that efficient… I can only agree, and I disabled it here ages ago.


Disabling it may be a trade-off, since no tags are inserted if they are not in the TM, but I've found it takes more time to fix incorrectly placed tags than to enter them correctly once.


There are many ways to easily add tags, and tag pairs.


Here is a reminder:


- Type tag number + Esc key = Just type a tag number followed by the Escape key to transfer the corresponding tag.

- Mouse tag placement = When enabled, tags can be added to the target segment simply by left-clicking where you’d like to place a tag. If you select a word or a word string, it is enclosed by two tags. This can be toggled via the Ctrl+Enter shortcut or via the <> button found in the target segment editor.

- List tags = Display a pop-up list of all tags in the source segment. Select and hit enter or click the tag you wish to insert. List tags also enables you to see the exact type of a tag. See the tool tip over each tag in the list. Default shortcut: F3.

- Insert next tag = Transfer the current tag in the list of tags from the source segment to the target segment via the corresponding shortcut or menu action.

- Insert source which includes tags = When you insert the complete source segment or a source text selection (Alt+I) which includes tags, these are also transferred to the target segment.


If you do decide to keep "Transfer tags to matches", you can still choose to remove all tags in the target segment via the Edit > Target segment > Remove tags action.


For tags coming from external projects, adding the ability for CafeTran to match internal tags (tags from internal TMX files, I mean) with external project tags would be great (at least for some external projects such as the memoQ ones you mention, as I think it already does so for SDL Trados files).


Additionally, in external projects, for segments that start or end with a tag pair (like bold or a link), I think CafeTran only shows one tag (not the starting or ending tag), which makes it difficult to move this tag pair around, as is often needed, depending on the language. This is also a major issue and a potential area for improvement.

Thank you, Jean, this is how I do it now: Type tag number + Esc key = Just type a tag number followed by the Escape key to transfer the corresponding tag.


It would be great indeed to have memoQ tags processed by CafeTran. Hoping for improvements in this area!

I created a memoQ project with bold, italics and a line break.


The memoQ TMX looks like this:


image


I exported to MQXLIFF and opened in CafeTran Espresso. I used the memoQ TMX to insert all EMs. All tags were inserted correctly.


This is how CafeTran Espresso modified the memoQ TMX file:


image


I then removed the targets from the projects, translated the segments anew and inserted CafeTran Espresso style markup for bold and italics:


image


Next segment:


image

And then comes the third segment. I would have expected the CafeTran Espresso style markup to be saved in the memory, thus wrapping the comma that made segment number 3 a fuzzy match:


image


This didn't happen. When I looked at the CafeTran Espresso memory I saw that the markup isn't stored!!!


image


I think it would be a great improvement if this markup were stored, and if memoQ TMX files were converted or marked up accordingly.

The files ...

tmx

Does it mean CafeTran does not remember the positions of memoQ tags?
