Problem with auto-assembling (or wrong settings?)

So I'm translating a file, but auto-assembling is prioritising nonsense fragment matches from the TM over glossary entries, despite the glossary priority being set to 'high' and the 'High priority for glossaries' box being checked on the 'Auto-assembly' preferences panel.

Am I doing something wrong, or is this a bug?

(Note: I'm still on about Update 11, but there's been nothing on this in the release notes, so I assume that's not the issue.)



The most likely causes are:


  1. The entry in the TM is longer. Longer entries prevail. This is an old problem, nothing you can do about it, not a CT problem.
  2. The hit is a "virtual match." You can change the threshold of virtual matches in the Preferences.
H

Hi Hans,

thanks for your reply. Excuse me if I'm being obtuse, but:

On point 1. why is 'longer entries prevail' not a CT problem - surely if I set glossaries to have high priority, this should not be dependent on entry length. I suspect I'm missing something here.

On point 2. Where is this option? I've had a look through the preferences tabs. The Memory tab has lots of options, but it's entirely unclear what they do. Is this documented anywhere??

Many thanks,
Jeremy

 

Amos: why is 'longer entries prevail' not a CT problem


I should have been more clear. It's an AA problem dating back to DejaVu late last century.


 Where is this option?


Subsegment to virtual threshold.


H.

I've increased the 'subsegment to virtual threshold' to 1000 and I still have this problem - it would be helpful (Igor!) if there was some explanation of these settings!!

 



Amos: it would be helpful (Igor!) if there was some explanation of these settings!!


Hear! Hear! How to do things can be helpful, it's the why to do things that's essential.


H


Quick follow-up on the above - restarting CafeTran does remove some (possibly all) of the offending virtual matches. This doesn't affect the fact that:
1. Glossaries should be prioritised in accordance with the settings chosen.
2. Changing the 'subsegment to virtual threshold' setting should either cause CafeTran to recalculate these matches or it should warn you that you need to restart to reset these matches.

 

1. Glossaries should be prioritised in accordance with the settings chosen.


They are prioritized for the same entry only. As Hans vdB mentioned above, the Auto-assembling engine analyses the segment in such a way that a longer phrase match, which may contain shorter matches, is picked for auto-translation. Although not perfect, that is the most natural and efficient approach because when the program finds a long phrase match (a long subsegment match), it does not have to look back for time-consuming further analysis. It just treats the phrase as a whole and moves on to find the next match.


Igor    
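
To illustrate the longest-match behaviour described above, here is a minimal Python sketch. It is not CafeTran's code: the function name `auto_assemble`, the two dictionaries, and the placeholder "translations" are all made up for the example, and it assumes plain whitespace tokenisation. It only shows why a glossary entry can lose to a longer TM fragment even when glossaries have priority for the same phrase.

```python
def auto_assemble(source, tm_fragments, glossary):
    """Greedy longest-match assembly (hypothetical model, not CafeTran's engine)."""
    words = source.split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest phrase starting at word i first, then shorter ones.
        for j in range(len(words), i, -1):
            phrase = " ".join(words[i:j])
            # For the *same* phrase, the glossary is checked before the TM.
            if phrase in glossary:
                out.append(glossary[phrase])
                i = j
                break
            if phrase in tm_fragments:
                out.append(tm_fragments[phrase])
                i = j
                break
        else:
            # No match of any length starts here: copy the source word.
            out.append(words[i])
            i += 1
    return " ".join(out)


glossary = {"control unit": "<GL: control unit>"}
tm_long = {"the control unit": "<TM: the control unit>"}

# The longer TM fragment swallows the glossary term.
print(auto_assemble("check the control unit daily", tm_long, glossary))
# Without the longer fragment, the glossary term is used.
print(auto_assemble("check the control unit daily", {}, glossary))
```

In this model the glossary only wins when both resources contain exactly the same phrase, because a longer match is accepted as soon as it is found and the shorter phrases inside it are never examined.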

IK:  Although not perfect, that is the most natural and efficient approach because when the program finds a long phrase match (a long subsegment match), it does not have to look back for time-consuming further analysis. It just treats the phrase as a whole and moves on to find the next match.


It's not perfect, but it does make sense. In most cases, the longer match is the better match. I think I solved the problem by adding those wrong subsegment hits (fragments, words) to the ProjectTM with the correct translation. The next segment where they occur should then be OK, and I also use the ProjectTM for consistency QA.


H.

Adding terms to the TM to get around bad virtual matches is a good suggestion, Hans. I will try that. Thanks!!

 

THIS! I have the same issue. I have a specific project memory to which I assigned High priority, and still machine translation (MyMemory) and "assembled" suggestions always take precedence over a fuzzy segment match from the TM.

For instance, here is an attached file where I have a "19% assembled" suggestion and an 82% fuzzy match TM hit. Well, CafeTran automatically populated my target segment with the "19% assembled" translation instead of the "82% TM match" (which comes from a TM created and controlled by me).


By the way, my memory is a segment AND fragment memory, so I don't think that's the problem. Any ideas, after this was last discussed 2 years ago?

Hello,


If I understand correctly, the issue you describe is not the same as the one discussed above.


The OP had an issue with the results offered by the Auto-assembling feature (which use segments, fragments, glossary entries, and possibly MT).

You seem to have auto-assembling and MT results take precedence over relatively high TM matches.


Can you check the settings you have in Preferences > Memory? If needed, share a screenshot here.


These settings are explained here: https://github.com/idimitriadis0/TheCafeTranFiles/wiki/1-Preferences#memory


In particular, you might want to raise the Auto-assembling insert threshold and lower the Fuzzy match insert threshold.


You can also check Preferences > Auto-assembling and review the first settings in Preferences > MT services (Team auto-assembling with machine translation and Team high-priority fragments only). All are explained in the same document.



Thanks Jean, you always give great input!


Here is a screenshot of my parameters (comparing with the link you gave, it seems like I have the basic settings). What would you suggest? Should I just tinker with these values until I find the best optimisation or have you found values that kind of work well for all projects?


Thanks!

I currently don't use Auto-assembling much (it does not work well for me in French for most content types I translate), so I have raised its threshold from 70 to 90. It never bothers me.


I have also lowered the TM match threshold to 80, so your 82% TM match should have been inserted automatically.
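
As a rough way to reason about these two thresholds, here is a small Python sketch of the behaviour one would expect from them. This is an assumed model only, not CafeTran's actual logic: the `Suggestion` class, the `pick_auto_insert` function, and the rule "highest-scoring suggestion at or above its own threshold wins" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    kind: str   # "tm" for a fuzzy TM match, "assembled" for Auto-assembling
    score: int  # match percentage, 0-100
    text: str


def pick_auto_insert(suggestions, fuzzy_threshold=80, assembling_threshold=90):
    """Return the suggestion expected to be auto-inserted, or None (assumed model)."""
    thresholds = {"tm": fuzzy_threshold, "assembled": assembling_threshold}
    # Keep only suggestions that reach the insert threshold for their own kind.
    eligible = [s for s in suggestions if s.score >= thresholds[s.kind]]
    # Among those, prefer the highest score; nothing is inserted if none qualify.
    return max(eligible, key=lambda s: s.score, default=None)


candidates = [
    Suggestion("assembled", 19, "assembled text"),
    Suggestion("tm", 82, "fuzzy TM text"),
]
print(pick_auto_insert(candidates))
```

Under this model, with the fuzzy threshold at 80 and the Auto-assembling threshold at 90, the 82% TM match is picked and the 19% assembled suggestion is not. If the program inserts the assembled text anyway, some other setting or rule is involved that this sketch does not capture.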


You can always quickly press F1 (to bring up the Auto-assembling pane, which also shows high TM matches you can quickly insert) to compare the Project source and TM source segments and spot the differences.


If MT content is still being inserted over TM matches, you might want to disable Preferences > MT services > Team auto-assembling with machine translation (or perhaps keep it enabled and also enable Team high-priority fragments only).


Cheers!

Weirdly, I have changed my settings quite a bit (see image) and I still get automatic insertion of a 27% assembled segment rather than an 89% fuzzy TM match. Igor?
