
Glossary options

Hi everyone!

Here's a quick question about glossary options:

What is the difference between the "Prefix matching" and "Look up word stems" options on the Glossary tab of the Preferences?

Thanks for your help,
Martin

 


Hello Martin,


This is the kind of information that you can find in the reference documents.


In the Preferences reference document:


Preferences > Glossary > Look up word stems: https://github.com/idimitriadis0/TheCafeTranFiles/wiki/1-Preferences#look-up-word-stems


If enabled, CafeTran uses word stemming when querying for glossary matches. CafeTran provides stemming based on the Hunspell dictionary.


Preferences > Glossary > Prefix matching: https://github.com/idimitriadis0/TheCafeTranFiles/wiki/1-Preferences#prefix-matching


If this option is enabled, it introduces automatic fuzziness to your glossary matches.


Note: As the “Prefix length” option is shared with TM fragments, you can adjust the prefix length in the options for the memory (e.g. after right-clicking in the memory pane). You will need to reload the glossary after turning the Prefix matching option on or off. See the TM options reference document for more information on the prefix length.


Here's what the TM options reference document says for the Prefix length option.


Prefix matching (%) = Checkbox and drop-down menu. Choices: Fixed length, 10, 20, 30, 40, 50, 60, 70, 80, 90. Default: Fixed length.


When this option is selected, CafeTran analyzes the beginnings of words (here called prefixes) and discards any endings responsible for inflection. This significantly increases the number of hits for highly inflected languages. The prefix length is set as a percentage: the higher the percentage, the longer the prefix that the program analyzes. The length can also be fixed, when the “Fixed length” option is selected instead of a percentage. In that case, all words get the minimal prefix length, no matter their actual length.
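To make the percentage idea concrete, here is a minimal Python sketch of how such prefix matching could work. It is not CafeTran's actual code: the function names, the rounding rule, the reading of the percentage as a share of the word's length, and the fixed length of 4 characters are all assumptions made for illustration.

```python
import math

def prefix(word, percent=None, fixed_len=4):
    """Return the leading part of a word used for comparison.

    percent=None models the "Fixed length" choice. fixed_len=4 is an
    assumed value, not CafeTran's documented minimum.
    """
    if percent is None:
        return word[:fixed_len]
    # Assumption: the percentage applies to the word's own length, rounded up.
    keep = max(1, math.ceil(len(word) * percent / 100))
    return word[:keep]

def prefix_match(term, word, percent=None, fixed_len=4):
    # Hypothetical rule: a text word matches a glossary term
    # when it starts with the term's prefix.
    return word.lower().startswith(prefix(term.lower(), percent, fixed_len))
```

With these assumptions, prefix_match("translation", "translations", percent=50) is true, while "transport" only matches "translation" at lower percentages such as 30, which shows how a shorter prefix makes matching fuzzier.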


Note: As the "Prefix length" option for glossaries (Preferences > Glossary > Prefix matching) is shared with TM fragments, you adjust the prefix length for glossaries in the options for the memory.


A note on Custom prefixes: If a word is too highly inflected for automatic prefix matching, you can enter your terms in the memory and determine the prefix of a word manually. This is done by inserting the pipe character | at the end of a prefix in a word. For example, the Polish phrase "piękny dzień" (a beautiful day) has a highly inflected word "dzień" occurring in a number of cases (dnia, dni, dniom). If you insert the pipe characters at the following positions - "pięk|ny d|zień", CafeTran will also recognize other forms of the phrase (pięknego dnia, pięknych dni etc.). Note that inserting the pipe character in the first word of the phrase - "pięk|ny" is optional since its inflection is quite regular and CafeTran should recognize its prefix automatically.
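The pipe notation can be illustrated with a short Python sketch. Again, this is only a guess at the logic, not CafeTran's implementation: the function names are hypothetical, and a word without a pipe is simply compared as a whole here, whereas CafeTran would apply its automatic prefix to it.

```python
def term_prefixes(entry):
    # "pięk|ny d|zień" -> ["pięk", "d"]; a word without a pipe is kept whole.
    return [word.split("|", 1)[0].lower() for word in entry.split()]

def phrase_matches(entry, phrase):
    """Check whether each word of the phrase starts with the manual prefix
    of the corresponding word in the glossary entry (hypothetical rule)."""
    prefixes = term_prefixes(entry)
    words = phrase.lower().split()
    return len(words) == len(prefixes) and all(
        w.startswith(p) for w, p in zip(words, prefixes)
    )
```

Under these assumptions, the entry "pięk|ny d|zień" matches "pięknego dnia" and "pięknych dni" as well as the base form "piękny dzień".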


Hope it helps!

Thanks a lot for the detailed info and the link to the reference documents. I only searched the knowledgebase and the forum before posting my question.


One question remains: If there's more than one TM open (the ProjectTM and one or more reference TMs), which TM settings will be used? The ones for the ProjectTM?


Cheers,

Martin

That's a good question, Martin, to which I do not have an answer, at least not without testing. My guess is that it takes the settings from the TM that has the "Fragments memory" option enabled (if more than one has that on, what then?), but it could simply be that it uses the ProjectTM to pick up the setting.


I hope Igor can provide us with more insight here. I'll happily update the information already available on that point in the reference documents.


Cheers,

Jean

> you can adjust the prefix length in the options for the memory.

> which TM settings will be used


The prefix length option is shared (global). You can set or change it via the options panel of any translation memory.  
