
Glossary messed up (and I do not know why)

Hi,


I imported my main glossary from memoQ (only the terms, nothing else).


Before using it in memoQ, I applied massive stemming (replacing "\t" with "|\t"), as it seems to have more advantages than drawbacks in my case. Then I copied the glossary and inserted it into my CT glossary. I optimized it (removed duplicates, removed entries where source = target), and now I have a kind of fuzziness I do not understand.


For "perform" it shows the term for "form". For "recode" it shows the term for "code".


There are no regex terms in this glossary.


Any hints?


Hi Torsten,


CafeTran also uses pipes for stemming. For example:


perform|ance matches perform, performance

re|code| matches both recode and code


so perhaps your memoQ glossary is compatible in this regard.


Igor


Uh, sorry, my mistake, I should have written "Before using it in CT" instead of "Before using it in memoQ".


I now have the CT glossary with this unintended massive stemming. It looks like this:


source term| <tab>target term


And it has hits as described, but I do not want them. What can I do?

Hi Torsten,


Just open your text glossary in a text editor and search and replace all | characters with nothing to remove them.
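If the glossary is too large to handle comfortably in an editor, the same clean-up can be scripted. The following is only a minimal Python sketch, assuming the glossary is a tab-delimited text file with the source term in the first column; the function names are made up for illustration and are not part of CafeTran. It removes pipes from the source column only, so a pipe that legitimately belongs to a target term is left alone.

```python
def strip_pipes(line: str) -> str:
    """Remove every '|' from the first tab-separated field of one glossary line."""
    # Assumption: the source term is everything before the first tab.
    source, sep, rest = line.partition("\t")
    return source.replace("|", "") + sep + rest


def strip_pipes_from_text(text: str) -> str:
    """Apply strip_pipes to every line, keeping the original line endings."""
    return "".join(strip_pipes(line) for line in text.splitlines(keepends=True))


# Example with the entry layout shown earlier in this thread:
cleaned = strip_pipes("source term|\ttarget term")
# cleaned == "source term\ttarget term"
```

To use it on a real file, read the glossary as text, pass the contents to strip_pipes_from_text, and write the result to a new file so the original stays available as a backup.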


Igor

Hi Torsten!

I experience the same with my glossaries (e.g. CT identifies the glossary entry "haften" when my text contains the word "Gesellschaften").

When finding glossary matches, CT seems to ignore the beginning of words, regardless of the stemming you use.

Cheers,
Martin

 

Okay, I removed all pipe characters and the problem is solved. I now add pipes only in certain cases (and even then the "too much fuzziness" problem does not occur). I can only assume that, during the search and replace described above, a pipe character was introduced somewhere in the glossary where it shouldn't be.
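For anyone who wants to check where the pipes sit before deleting them, here is a rough Python sketch. It assumes a tab-delimited glossary with the source term in the first column; the function name and the sample entries are invented for illustration. It only lists the entries whose source term contains a pipe, so they can be reviewed by hand.

```python
def find_piped_entries(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) for every entry whose source term contains '|'."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Assumption: the source term is everything before the first tab.
        source = line.partition("\t")[0]
        if "|" in source:
            hits.append((number, line))
    return hits


# Invented sample entries, not taken from a real glossary:
sample = "perform|ance\ttarget A\nform\ttarget B\nre|code|\ttarget C\n"
for number, line in find_piped_entries(sample):
    print(number, line)
```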
