Glossary messed up (and I do not know why)


I imported my main glossary from memoQ (only the terms, nothing else).

Before using it in memoQ, I applied massive stemming (replacing "\t" with "|\t"), as it seems to have more advantages than drawbacks in my case. Then I copied the glossary and inserted it into my CT glossary. I optimized it (removed duplicates, removed source=target entries), and now I have a kind of fuzziness I do not understand.

For "perform" it shows the term for "form". For "recode" it shows the term for "code".

There are no regex terms in this glossary.

Any hints?

Hi Torsten,

CafeTran also uses pipes for stemming. For example:

perform|ance matches perform and performance

re|code| matches both recode and code

so perhaps your memoQ glossary is compatible in this regard.


Uh, sorry, my mistake, I should have written "Before using it in CT" instead of "Before using it in memoQ".

I now have the CT glossary with this unintended massive stemming. It looks like this

source term| <tab>target term

And it has hits as described, but I do not want them. What can I do?

Hi Torsten,

Just open your text glossary in a text editor and search & replace all | characters with nothing to remove them.
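If the glossary is too large to handle comfortably in an editor, the same search & replace can be scripted. The following is only a minimal sketch in Python: the function name strip_pipes and the source_only option are made up for illustration, and it assumes a plain tab-delimited text glossary with the source term in the first column.

```python
def strip_pipes(glossary_text: str, source_only: bool = False) -> str:
    """Remove '|' characters from a tab-delimited glossary.

    Hypothetical helper, not part of CafeTran. Assumes one entry per
    line with the source term in the first tab-separated column.
    """
    cleaned = []
    # keepends=True preserves the original line endings.
    for line in glossary_text.splitlines(keepends=True):
        if source_only and "\t" in line:
            # Only touch the source column, leave the other columns as they are.
            source, rest = line.split("\t", 1)
            line = source.replace("|", "") + "\t" + rest
        else:
            # Same effect as replacing every | with nothing in an editor.
            line = line.replace("|", "")
        cleaned.append(line)
    return "".join(cleaned)
```

Called as strip_pipes(text), it removes every pipe, as suggested above. With source_only=True it leaves pipes in the target column alone, which may matter if a target term legitimately contains one. Work on a copy of the glossary file rather than the original.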


Hi Torsten!

I experience the same with my glossaries (e.g. CT identifies the glossary entry "haften" when my text contains the word "Gesellschaften").

When finding glossary matches, CT seems to ignore the beginning of words, regardless of the stemming you use.



Okay, I removed all pipe characters and the problem has been solved. The issues are gone, and I now add pipes only in certain cases (and even then the problem of "too much fuzziness" does not occur). I can only assume that the search and replace described above introduced a pipe character somewhere in the glossary where it shouldn't be.
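For anyone who wants to check a glossary for this kind of stray pipe before removing everything, here is a rough sketch in Python. It rests on an assumption: that the only intended pipe in an entry is the single one directly before the first tab (the "source term|<tab>target term" pattern above). The function name find_unexpected_pipes is hypothetical, and the script knows nothing about how CT actually matches terms; it only lists lines for manual review.

```python
def find_unexpected_pipes(glossary_text: str) -> list[tuple[int, str]]:
    """List (line number, line) pairs that contain a '|' anywhere other
    than directly before the first tab.

    Hypothetical helper for manual review of a tab-delimited glossary.
    """
    hits = []
    for number, line in enumerate(glossary_text.splitlines(), start=1):
        # Split into the source column and everything after the first tab.
        source, _, rest = line.partition("\t")
        # Assumption: one trailing pipe on the source term is intended.
        if source.endswith("|"):
            source = source[:-1]
        if "|" in source or "|" in rest:
            hits.append((number, line))
    return hits


sample = "form|\tForm\nre|code|\tumcodieren\nhaften\thaften\nterm\tZiel|wort\n"
for number, line in find_unexpected_pipes(sample):
    print(number, repr(line))
```

In this made-up sample, lines 2 and 4 are reported: line 2 has a pipe inside the source term, and line 4 has one in the target column. A blanket replacement of "\t" with "|\t" also puts a pipe before every further tab on a line, so entries with more than two columns are worth a look.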

