Simple detail question:
How can I make CT recognize this term?
Please note that there are different occurrences
Note the difference in the apostrophes (the 3rd case is quite rare). Which apostrophe is used depends on the program and some other factors.
If I understand correctly, I have the following options
Perhaps I overlooked something?
The handling of terms with apostrophes concerns users translating from French, Italian, Catalan and many more languages (that I may not be aware of). I do not think this is an exotic problem.
You mean L’AMERIQUE in caps? Hm, that's a bugger. This would require contributing to the official Hunspell French dictionary. For a specific CafeTran installation, I guess it should work if we add AMERIQUE in the user's spelling dictionary. Right?
For obligatory client glossaries, I'm with you that it would be great if all required terms could be properly recognized.
However, it might be useful to note CafeTran already offers a great way to check these terms: QA > Word lists.
As a test, I have simply added "Amérique du Sud" in a text file and used it with "Check source segments for words". The QA check filtered both segments from your text file, so this seems like a viable option/safety net, and just a matter of pasting all source glossary terms to a text file.
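To make the idea concrete, here is a minimal sketch of what such a word-list check amounts to. It is not CafeTran's implementation: the function name `segments_with_terms` and the sample segments are made up for illustration, and a plain case-insensitive substring test is assumed.

```python
def segments_with_terms(segments, terms):
    """Return (index, segment, found_terms) for every segment that
    contains at least one term (case-insensitive substring match)."""
    hits = []
    for index, segment in enumerate(segments):
        folded = segment.casefold()
        found = [term for term in terms if term.casefold() in folded]
        if found:
            hits.append((index, segment, found))
    return hits


# Hypothetical sample data: the same term after a straight and a curly apostrophe.
segments = [
    "Il parle de l'Amérique du Sud.",
    "L\u2019Amérique du Sud est vaste.",
    "Aucun terme ici.",
]
for index, segment, found in segments_with_terms(segments, ["Amérique du Sud"]):
    print(index, found, segment)
```

Because the term itself contains no apostrophe, a substring test like this finds it regardless of which apostrophe precedes it.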
> For a specific CafeTran installation, I guess it should work if we add AMERIQUE in the user's spelling dictionary. Right?
Unsure. Amerique is already in the glossary. As I am translating only from and not into French, I do not have a user spelling dictionary for FR (and would not know how to feed it). And this case would only apply to those who do not have a user-specific dictionary in French (as French is their SL, not their TL).
> As a test, I have simply added "Amérique du Sud" in a text file and used it with "Check source segments for words".
This would mean getting a bunch of segments where terms are recognized, or perhaps not. Let's take a segment of 50 words with 6 terms (plus one with an apostrophe). I will see the six terms, but not the seventh. A workaround, but a rather complicated and error-prone one, especially with big glossaries.
This is the way OmegaT handles the thing.
Recognition works with both main apostrophes, only "autres" is not recognized. In further tests, "d'Amérique" and "l'amérique" in lower case have been recognized.
OmegaT vs. CafeTran 3:0 (well, not in most other aspects, of course).
> This would mean getting a bunch of segments where terms are recognized, or perhaps not. Let's take a segment of 50 words with 6 terms (plus one with an apostrophe). I will see the six terms, but not the seventh.
In my test, both "l'Amérique du Sud" and "l’Amérique du Sud" were recognized in this QA check, so I think it should work for all exact term occurrences.
A language-specific user dictionary can be edited in Edit > Edit user's spelling dictionary, if you load a project that uses that target language. You can install the French Hunspell dictionary even if you don't translate into that language.
That's all clear. I am already not convinced by the approach that CT recognizes only single words from Hunspell after an apostrophe. And now I am supposed to create an extra project (FR Hunspell is installed anyway) to add terms or term variants that are not in Hunspell? Seriously? Only to recognize a term behind an apostrophe? This might be okay for those of us who love fiddling around with files and playing with text editors, but not for most users. And up to now none of this is even documented (perhaps the process of documenting it would reveal how cumbersome this is).
... while even the simplistic tool OmegaT offers this feature (see above, sorry, this argument is rather ugly).
> this argument is rather ugly
Yes, it is, but I don't really care, as such arguments are the least convincing and the first to be ignored completely.
Anyway, I think an option in Preferences to ignore (not match) user-defined morphemes might be added in the near future to cover all similar cases. Instead of hard-coding them, the user of a specific language will be able to define those few prefixes.
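Roughly, such an option could work like the sketch below. This is only an illustration under stated assumptions: the prefix list `ELIDED_PREFIXES`, the apostrophe set and the function `strip_elision` are hypothetical and do not correspond to any existing CafeTran preference.

```python
import re

# Hypothetical user-defined list of elided prefixes (French examples).
ELIDED_PREFIXES = ["l", "d", "qu", "j", "n", "s", "c", "m", "t"]

# Straight apostrophe, right single quotation mark, modifier letter apostrophe.
APOSTROPHES = "'\u2019\u02bc"


def strip_elision(token, prefixes=ELIDED_PREFIXES):
    """Remove one leading 'prefix + apostrophe' from a token, if present."""
    # Longest prefixes first, so "qu" is tried as a whole before shorter ones.
    alternatives = "|".join(
        re.escape(p) for p in sorted(prefixes, key=len, reverse=True)
    )
    pattern = "^(?:%s)[%s]" % (alternatives, re.escape(APOSTROPHES))
    return re.sub(pattern, "", token, count=1, flags=re.IGNORECASE)


# The remainder is what would be looked up in the glossary.
for token in ["l'Amérique", "L\u2019AMERIQUE", "d'autres", "aujourd'hui"]:
    print(token, "->", strip_elision(token))
```

Only a prefix at the very start of the token is removed, so a word such as "aujourd'hui" stays intact.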
Indeed there are cases where an apostrophe ends up in a glossary.
It would be nice if the solution outlined above also covered these cases (a term with a curly apostrophe is also shown when there is a straight apostrophe in the source), without the user being forced to use RegEx.
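One way to get this behaviour without any RegEx on the user's side is to normalize apostrophes on both sides before comparing. The following is a minimal sketch of that idea, not a description of how CafeTran matches terms; the names `APOSTROPHE_MAP`, `normalize` and `term_in_segment` are made up for the example.

```python
# Map the common apostrophe variants to the straight ASCII apostrophe.
APOSTROPHE_MAP = str.maketrans({
    "\u2019": "'",  # right single quotation mark (curly apostrophe)
    "\u2018": "'",  # left single quotation mark
    "\u02bc": "'",  # modifier letter apostrophe
})


def normalize(text):
    """Unify apostrophes and fold case so variants compare equal."""
    return text.translate(APOSTROPHE_MAP).casefold()


def term_in_segment(term, segment):
    """True if the glossary term occurs in the segment after normalization."""
    return normalize(term) in normalize(segment)


# Glossary term with a curly apostrophe, source segment with a straight one.
print(term_in_segment("l\u2019Amérique", "Il visite l'Amérique du Sud."))
```

Since the same normalization is applied to the glossary entry and to the source text, it does not matter which apostrophe variant either of them contains.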