
Long glossary entries and 'Display longest match only' option

Hi all,

I'm a bit confused by the way the glossary lookup works for long glossary entries.

I have a (client) glossary with lots of long entries.

CT fails to find these, because there is also a short glossary entry for the first half of many of these.

Glossary contains:

Institute of Molecular Biology


Where the text contains "Institute of Molecular Biology", CT is only finding the glossary entry 'Institute'.

The 'Display longest match only' option is ticked.

Is this a bug or a feature? If a feature, how can I configure it to get the long glossary entry?

Many thanks for any assistance,

amos: Is this a bug or a feature?

Neither, I think. CT searches from left to right, finds "Institute" first, and then stops. Now if only you had used TMX termbases...


It's all very strange. Perhaps:

  1. CT starts looking for "Institute of Molecular Biology"
  2. It finds "Institute". That's perfect. There's no longer entry (in that glossary)
  3. So it starts looking for "of Molecular Biology"
  4. It won't find that, because in the client's glossary, it says "Institute of Molecular Biology".
  5. Result: CT inserts the translation for "Institute".

But it gets stranger (if my memory serves me well, and I bet it does for once, because you won't believe it). If you disable "Display longest match only" (yes, disable), CT will still auto-assemble "Institute", but will display "Institute of Molecular Biology" in the MatchBar as the first result for that subsegment.

TMX termbases.


OMG, that works! Thanks Woorden!

So in fact the "Display longest match only" option means exactly the opposite of what it says.

Only in CafeTran!!


amos: OMG, that works!

What works is TMX termbases. Nothing else.


Glossaries should find the match for both "Institute" and "Institute of Molecular Biology". With the "Display longest match only" option on, it would show only the longer match of the two. The only exception is where "Institute" is part of another (earlier) match within this segment, which should be shown in the Matchboard.
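
For anyone trying to picture the intended behaviour described above, here is a minimal sketch in Python. It is not CafeTran's actual code: the function names `find_matches` and `longest_only` are made up for illustration, and it assumes plain exact substring matching with no fuzziness.

```python
def find_matches(segment, terms):
    """Return (start, end, term) for every exact occurrence of each term."""
    matches = []
    for term in terms:
        start = segment.find(term)
        while start != -1:
            matches.append((start, start + len(term), term))
            start = segment.find(term, start + 1)
    return sorted(matches)


def longest_only(matches):
    """Drop every match that lies completely inside a longer match."""
    kept = []
    for start, end, term in matches:
        covered = any(
            o_start <= start and end <= o_end and (o_end - o_start) > (end - start)
            for o_start, o_end, _ in matches
        )
        if not covered:
            kept.append((start, end, term))
    return kept


# Hypothetical glossary and segment, for illustration only.
terms = ["Institute", "Institute of Molecular Biology"]
segment = "The Institute of Molecular Biology and the Institute."

all_hits = find_matches(segment, terms)
shown = longest_only(all_hits)
```

In this sketch both entries are found first, and the shorter one is hidden only where the longer one covers it. The second, stand-alone "Institute" in the sample segment is still kept.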

IK: would show only the longer match of the two

But would it blend? I mean, would it auto-assemble?


Yes, the longer of the two should be selected for auto-assembling.

Further investigation shows that my above statement is wrong and the problem in this particular case is not what I thought.
The problem is actually that the source text says "Institute" (institutes), not "Institut" (Institut); "Institute" is erroneously in the (client) glossary as "institute", and there's no fuzziness. So for once, Woorden is right that TMXs (where I believe fuzziness has been implemented) would solve this particular problem.
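
To illustrate why a one-letter difference matters when there is no fuzziness, here is a rough sketch using Python's standard `difflib`. It only stands in for whatever fuzzy matching the TMX lookup may do; the helper names and the 0.9 threshold are assumptions for the example, not CafeTran settings.

```python
from difflib import SequenceMatcher


def exact_hit(word, term):
    """Exact lookup: the word must equal the glossary term."""
    return word == term


def fuzzy_hit(word, term, threshold=0.9):
    """Fuzzy lookup: accept near-identical strings (threshold is an assumption)."""
    ratio = SequenceMatcher(None, word.lower(), term.lower()).ratio()
    return ratio >= threshold


# "Institute" in the source text vs. an entry stored as "Institut".
print(exact_hit("Institute", "Institut"))
print(fuzzy_hit("Institute", "Institut"))
```

With exact matching the two forms never meet, whereas a similarity-based comparison treats them as close enough, which would explain the difference seen between the two resource types.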


Amos: ...that TMXs (where I believe fuzziness has been implemented) would solve this particular problem.

I'm pretty sure the whole problem wouldn't exist if you had used TMX termbases. Now if you're not in a hurry, could you please try?


Hi Hans, I'm almost certain you're right, but for me at present ease of editing outweighs the advantages of TMXs, particularly as I find glossaries only marginally helpful for the types of texts I translate, so that the overhead of converting to TMXs would not repay itself.


amos: present ease of editing outweighs the advantages of TMXs

And even that I contest.

H. (going for the CafeTran Super Fart Cup, and still have a lot of catching up to do)

Since the postings of Lenting/CafeTran Training in this thread have been deleted (again), and because this can lead to some misunderstanding, I re-posted them in the correct time order and context:





As you can see, a first name can be confusing: the first Hans mentioned refers to me, the others refer to Lenting.


I'm a little surprised by your posting, Hans. I deleted my contribution because Jeremy didn't find it on topic (and since he's the "topic owner", I wanted to follow his request). BTW: It would have shortened the thread even more if Jeremy had also removed his request that I refrain from answering. Perhaps the two of you can clean up this thread, so that it's on topic again? On a side note: Later I remembered that I recently compared TMX and glossaries on "conjugation-spanning multi-word terms". Neither recognised conjugated forms unless pipes were inserted.