Should characters defined in Preferences > Memory > Do not match be displayed for TM matches with source text comparison in the TM panes, the Auto-assembling panel (F1) and the Matchboard (with Show fuzzy source segments enabled)?
I really think they should.
Reasoning: when TM matches differ in characters that are not included in the "Do not match" field, CTE tends to lower the match percentage drastically.
To get around this, I tend to keep various punctuation or white-space characters in the "Do not match" list. The list is also populated by default, so most users have "Do not match" characters set.
The advantage is that CTE does not lower the TM match percentage for these characters, so more TM matches are caught.
However, this hurts readability when reviewing the TM matches (in the TM pane, the Auto-assembling panel or the Matchboard).
Indeed, the source text comparison is stripped of all "Do not match" characters, which makes it harder to read. Only when searching for the TM match manually can you actually view the match source text in full. I have had to resort to this manual search at times, because I could not quickly make sense of the source text differences and work out which part of the TM match I might wish to reuse.
Ideally, CafeTran would still apply the "Do not match" characters for matching purposes, but display the source text comparison with these characters preserved. They could get a background color, though, to highlight the fact that they are not taken into account.
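To make the idea concrete, here is a rough sketch in Python of what I mean. It is not how CafeTran actually works internally. The DO_NOT_MATCH set, the function names and the square-bracket markers (standing in for a background color) are all made up for illustration, and difflib is only a stand-in for whatever similarity measure CTE really uses.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for the "Do not match" field (not CafeTran's real default).
DO_NOT_MATCH = set(" ,.;:!?-")


def strip_ignored(text, ignored=DO_NOT_MATCH):
    """Remove the ignored characters; used for scoring only."""
    return "".join(ch for ch in text if ch not in ignored)


def match_ratio(segment, tm_source, ignored=DO_NOT_MATCH):
    """Score two segments on their stripped forms (difflib as a stand-in)."""
    a = strip_ignored(segment, ignored)
    b = strip_ignored(tm_source, ignored)
    return SequenceMatcher(None, a, b).ratio()


def mark_ignored(text, ignored=DO_NOT_MATCH, open_mark="[", close_mark="]"):
    """Return the full text, wrapping each run of ignored characters in markers.

    The markers play the role of a background color in a real UI.
    """
    out = []
    in_run = False
    for ch in text:
        if ch in ignored and not in_run:
            out.append(open_mark)
            in_run = True
        elif ch not in ignored and in_run:
            out.append(close_mark)
            in_run = False
        out.append(ch)
    if in_run:
        out.append(close_mark)
    return "".join(out)
```

The point is that scoring and display use two different views of the same string: match_ratio() only ever sees the stripped text, while mark_ignored() keeps every character, so the reviewer reads the segment in full and can still tell which characters did not count.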
Another solution would be to greatly reduce the impact that punctuation and white-space characters have on TM matches. If the only differences are tags, punctuation or white-space characters, CTE could limit the penalty to 1%, so that we get 99% matches.
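Again as a rough Python sketch only, under my own assumptions: the MINOR set and match_percent() are hypothetical names, tags are left out for simplicity, and difflib stands in for CTE's real scoring. Capping everything else at 98% is my own choice, made so that 99% keeps a clear meaning.

```python
from difflib import SequenceMatcher

# Hypothetical set of "minor" characters: white space and punctuation.
MINOR = set(" \t,.;:!?-")


def strip_minor(text, minor=MINOR):
    """Remove minor characters so only the 'real' content is compared."""
    return "".join(ch for ch in text if ch not in minor)


def match_percent(segment, tm_source, minor=MINOR):
    """Return a match percentage with a capped penalty for minor differences."""
    if segment == tm_source:
        return 100  # identical, including punctuation and spacing
    a = strip_minor(segment, minor)
    b = strip_minor(tm_source, minor)
    if a == b:
        return 99  # only minor characters differ: penalty limited to 1%
    # Real content differs: ordinary fuzzy score, kept below the 99% tier.
    ratio = SequenceMatcher(None, a, b).ratio()
    return min(98, int(ratio * 100))
```

With this, "Press OK, then wait." against a TM entry "Press OK then wait" would come out at 99%, while a segment whose wording differs would get a regular fuzzy score.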
What do you think?
Anyone else who would like to see this change for TM comparison?
I can't help but notice that glossaries now tend to take most of the favours :-)