
Term recognition (reloaded)

Simple detail question:

How can I make CT recognize this term?

image


Please note that the term occurs in different forms:

  • l’Amérique du Sud
  • l'Amérique du Sud
  • l`Amérique du Sud

Note the difference in the apostrophes (the third case is quite rare). Which apostrophe is used depends on the program and some other factors.
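To make the three flavors concrete, here is a minimal sketch (outside CafeTran, and not how CT works internally) that maps the typographic apostrophe U+2019 and the grave accent U+0060 onto the plain U+0027. The function name normalize_apostrophes is made up for illustration.

```python
# Map the two "other" apostrophe flavors onto the plain ASCII apostrophe.
# U+2019 = typographic apostrophe, U+0060 = grave accent used as apostrophe.
APOSTROPHE_MAP = str.maketrans({"\u2019": "'", "\u0060": "'"})


def normalize_apostrophes(text: str) -> str:
    """Return text with every apostrophe variant replaced by U+0027."""
    return text.translate(APOSTROPHE_MAP)


# The three occurrences listed above, written with explicit escapes.
variants = [
    "l\u2019Amérique du Sud",
    "l'Amérique du Sud",
    "l\u0060Amérique du Sud",
]

# After normalization all three collapse into a single form.
normalized = {normalize_apostrophes(v) for v in variants}
print(normalized)
```

Running something like this over a source text before import would leave only one apostrophe flavor to deal with in the glossary.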


If I understand correctly, I have the following options:

  • Prefix matching: this produces many false matches, depending on the glossary. As a last resort that would be acceptable, but as far as I can see it does not work for multi-word terms such as "Amérique du Sud".
  • Pipe character at the start of the word. Really? For every French word starting with a vowel?
  • Enter the term with the "l'" (and the two or three apostrophe flavors). Seriously?


Perhaps I overlooked something?


The handling of terms with apostrophes concerns users translating from French, Italian, Catalan and many more languages (that I may not be aware of). I do not think this is an exotic problem.



After this success there were not that many "various glossary options for matching":
  • Look up word stems (kept activated after the first success)
  • insert U+002D into "additional space characters" (respectively delete it from there)
  • insert the apostrophe (this one: ’) into DNM (respectively delete it from there)
  • Prefix matching can be excluded here, as it gave many unwanted results

I will test this thoroughly in the next days.

Between your trials you seem to have played with various glossary options for matching as well as with the "Do not match" list. Perhaps restoring them to the point where it worked for you might help.

> It is really confusing. Once you claim that "Only Amérique as one-word term" is recognized, a few posts later you say it isn't.

I can only agree (the term stopped being displayed after re-opening the project on the same machine one day later: same project, same glossary, same TM).

From a pragmatic point of view: The issue is that now it is not recognized (any more), and we both (or at least I) do not know why. Surely no kind of Marian apparition.

Perhaps Jean Dimitriadis (idlm) has an idea (if he works from French to another language).

It is really confusing. Once you claim that "Only Amérique as one-word term" is recognized, a few posts later you say it isn't.

Only Amérique as a one-word term (having a term entry of its own, of course). That was here (the posting starting with "Okay, these are my next results:"). The settings should have been the same, though I see now that inserting the apostrophe into the DNM field does not change anything. Perhaps this was without a restart (can CT then be in a kind of interim state?). Porca madonna, as the Italians would say.

Maybe it would help to see if and how Amérique or Europe is recognized somehow after apostrophes.

So what was recognized in your "temporary success" above?

… and same for Europe => Europa when after an apostrophe (just in case Amérique is claimed to be too exotic or not included in Hunspell).


… and BTW, Amérique => Amerika isn't recognised here either after an apostrophe.

 

As a really very, very provisional workaround, okay.


But then again it turns glossary work into a real pain (see here and look for acheter). I have not yet tested converting the glossary into TMX. Would this help?

For multiword terms, like in your example, just add the article to your term to match the same in the source segment. 
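A rough sketch of what "adding the article" amounts to if all apostrophe flavors are to be covered. This is only an illustration: the helper name with_elided_article and the tab-separated output are assumptions, not a documented CafeTran format or feature.

```python
# Apostrophe flavors found in source texts:
# U+2019 (typographic), U+0027 (plain), U+0060 (grave accent).
APOSTROPHES = ["\u2019", "'", "\u0060"]


def with_elided_article(source_term, target_term, article="l"):
    """Return (source, target) pairs: the bare term first, then one
    pair per apostrophe flavor with the elided article prepended."""
    pairs = [(source_term, target_term)]
    for apostrophe in APOSTROPHES:
        pairs.append((f"{article}{apostrophe}{source_term}", target_term))
    return pairs


entries = with_elided_article("Amérique du Sud", "Südamerika")

# Print the pairs tab-separated, one glossary line per variant
# (the tab-separated layout is an assumption for this example).
for source, target in entries:
    print(f"{source}\t{target}")
```

That makes four glossary lines for a single term, and the same again for every other elided form (d', qu', etc.).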

Amérique du Sud => Südamerika

 

What's the glossary word you are trying to match, and to what word in the source segment?

  • Do not match content: ,.。:;!¡?¿[]{}()"«»‘’“”„‚ (see also here, BTW)
  • Prefix matching deactivated (it always is)
  • Look up word stems activated (and French Hunspell installed)
  • CT restarted


Nope.

1. Remove that apostrophe from "Do not match" list.

2. Deactivate "Prefix matching" and activate "Look up word stems" for glossaries. You may have too much fuzziness with the two options on at the same time.

3. Restart CafeTran.


With the "Do not match" option, CafeTran removes the listed characters from the matched segments to increase the chance of finding a match. However, "Look up word stems" needs the apostrophe to determine the stem of the word in this case.

And clicking on the number of "lAmérique du Sud" gives zero results (maybe it would have found "l'Amérique du Sud"). This should not be the case, no matter how the problem is resolved.
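To illustrate why "lAmérique du Sud" cannot match, here is a small sketch of the behaviour as described above. It is only a guess at the effect, not CafeTran's actual code: the function names are made up, the matching is a simple whole-word comparison, and the character list is the "Do not match" content quoted in this thread.

```python
# "Do not match" content as quoted in this thread.
DO_NOT_MATCH = ",.。:;!¡?¿[]{}()\"«»‘’“”„‚"


def strip_do_not_match(segment, chars=DO_NOT_MATCH):
    """Delete every listed character from the segment."""
    return "".join(ch for ch in segment if ch not in chars)


def contains_term(segment, term):
    """Whole-word, case-sensitive check on whitespace-separated tokens."""
    seg_tokens = segment.split()
    term_tokens = term.split()
    n = len(term_tokens)
    return any(
        seg_tokens[i:i + n] == term_tokens
        for i in range(len(seg_tokens) - n + 1)
    )


segment = "Il a visité l\u2019Amérique du Sud."

# Deleting the apostrophe glues the article onto the noun.
stripped = strip_do_not_match(segment)
found_after_delete = contains_term(stripped, "Amérique du Sud")

# Alternative (an assumption): treat the apostrophe as a separator instead.
spaced = strip_do_not_match(segment.replace("\u2019", " "))
found_after_space = contains_term(spaced, "Amérique du Sud")

print(stripped, found_after_delete)
print(spaced, found_after_space)
```

In this toy version, deleting the apostrophe produces the token "lAmérique", which no glossary entry contains, while treating the apostrophe like a space leaves "Amérique du Sud" intact as three words.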

 
