
Term recognition and AA: Strange behaviour

Following situation:

Client TM and Package TM: Medium priority

Total Recall TM: Low priority

Glossary: High priority


Source segment:

image

The whole segment is in the TB:

image

In this case with both accents:

image

Client TM and Package TM have only one result for the word "DE". Total Recall TM has several results.


I would expect AA to produce "Bilanz des vergangenen Jahres".


But the first irritating thing: this is the result when pressing F1:


image


The second irritating thing is that the term is recognized (see the glossary image), but this is not reflected in the source segment's colors (yes, I reloaded the glossary and even restarted CT).


Partial solution: When unchecking the Total Recall TM (or simply keeping it out of AA), AA does produce:


image


My questions:

  • Would it make sense to keep low priority TMs/TBs out of AA automatically (an option to be checked somewhere in the Preferences)? Or in other words: How much value is there in having low priority TMs inside AA by default?
  • Is it possible to have the term recognition reflected in the source segment's colors here as well (just as expected)?
  • Would it be possible to have a more exact match rate? Here it is two terms plus a space = 75 %? Of course it cannot be 100 %, but perhaps there are some small cogs to be adjusted...



Just keep your low-quality resources out of auto-assembling: there is a right-click option for both TMs and glossaries. If you feed low-quality resources into the AA engine, you may get low-quality results.

You should treat the percentage result for AA as an approximation. CT has no notion of meaning. For example, two assembled fragments can give a perfect result (then you may ask why it is not 100%, right?), while at other times the real quality can be very low. It all depends on your resources.

Having said that, the algorithms keep evolving, so perhaps one day CT will "understand" which assembled fragment is more meaningful to pick.

 

Indeed, 75 % is pretty fair as a match value. But the two terms cover 23 of the 24 characters (around 96 %) of the source segment (only one space is added), so I just wondered.
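To make the arithmetic behind that remark explicit, here is a minimal sketch of the character-coverage figure. It is only my own back-of-the-envelope calculation, not how CT computes its AA percentage; the function name `coverage_percent` is made up for illustration.

```python
def coverage_percent(covered_chars: int, total_chars: int) -> float:
    """Share of source characters covered by matched fragments, in percent.

    Hypothetical helper for illustration only; not CafeTran's AA scoring.
    """
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return 100.0 * covered_chars / total_chars


# Figures from the post: the two glossary terms cover 23 of 24 characters.
print(round(coverage_percent(23, 24), 1))
```

Plain character coverage says nothing about whether the assembled fragments are actually the right translation, which is presumably why the AA percentage is lower.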


Would it make sense to have this "out of AA" option as a permanent setting (as is the case with Read-Only)? As far as I can see, it isn't one.


Just out of curiosity: Would a glossary in TMX format bring better matches?


And there remains the problem of non-recognition in the source segment editor (no green highlighting), even after ruling Total Recall out.

> Just out of curiosity: Would a glossary in TMX format bring better matches?


I don't think so.


> And there remains the problem of non-recognition in the source segment editor (no green highlighting), even after ruling Total Recall out.


That would create a much bigger problem: differentiating between glossary match colors, TM match colors and non-translatable colors. The user would be lost in colorful patches, possibly canceling each other out, in the source segment editor.
