I have a question about what can be called "partial" glossary matches.
Partial match Type 1
When a glossary contains the term "all," then "all" as part of the term "usually" is highlighted.
Partial match Type 2
When a glossary contains the term "merchandise item;merchandise items" (two alternative entries), then, for a source sentence like "a number of merchandise items," both entries (singular and plural) are separately displayed on the glossary pane.
My question here is, is it possible to control this (e.g., elect not to display these partial matches)? It seems prefix matching has something to do with this, but I'm not sure.
You're welcome! By the way, for users who wish to try another matching algorithm (based on the Lucene engine) that can find all forms of a word, CafeTran offers integration with the TM-Town web service. After you upload your glossary there, CT will show you the matches automatically in its interface.
What about the semicolon and pipe characters in the glossary (on both the source and target sides)?
If I upload my glossaries to TM-Town, apart from the Lucene thing, will it work the same way as CT (e.g., displaying matches, auto-assembling, regex, etc.)?
Does this mean that you'll be providing new features primarily via the TM-Town platform, requiring an online connection? What is your path of development?
The Lucene search engine implementation is a feature provided by TM-Town, and CafeTran makes use of it via the available APIs, the same way it uses the APIs provided by the Google Translate and Microsoft Translator MT services. CafeTran's path of development is independent, but when there are APIs available that enhance the functionality of the program, it is naturally enjoyable and practical to implement the connection between the tools. TM-Town gives translators very nice cloud-based search and management tools for TMs and glossaries. CafeTran has always given translators a choice. It does not force you to use Google Translate, MS Translator or TM-Town services. You can use them in your workflow only if you find them useful.
Why would you add singular and plural at the left-hand side? Because you want to be able to swap your glossary (use it in the other direction)?
Are you suggesting that I don't have to add the plural unless the term is inflectional like "companies" for "company"?
Well, if you only want to use your glossary from left to right, I'd personally create separate entries for singular and plural, since this controls my target language too (there is correspondence in sing/plur for my S and T).
How about your language combination?
Would you please give me a brief explanation of how the glossary matching function is designed to work?
Those which I call "partial matches" occur in some cases (mostly for shorter terms), and not in other cases (mostly for longer terms/phrases) ... well, I'm at a loss what to do to get the results I want.
I use glossaries in both ways (English/Japanese).
If singular and plural English terms should be assigned different Japanese terms, I create separate entries, just as you'd do.
Well, to put it briefly, CT can identify, say, "control" in "controls," so it is not necessary to include the plural unless there is a special reason to do so (inflection, different meaning ...)?
Can you provide a screenshot of such a partial match? If your source language has a word separator and your tab-delimited glossary is not set as a regular expressions glossary, you should see only exact matches for the source terms. Partial matches are controlled via regular expressions. If you have source-side synonyms in your glossary, the one appearing in the source segment should be displayed.
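To make the distinction concrete, here is a minimal Python sketch of whole-word ("exact") matching against a glossary with semicolon-separated source synonyms. It is only an illustration under stated assumptions, not CafeTran's actual code: the function name find_exact_matches, the placeholder targets and the use of \b word boundaries are all hypothetical.

```python
import re

def find_exact_matches(segment, glossary):
    """Return (term, target) pairs whose source term occurs as a whole word."""
    matches = []
    for source_field, target in glossary:
        # Assumption: semicolons separate source-side synonyms.
        for term in source_field.split(";"):
            term = term.strip()
            # \b anchors the term at word boundaries, so "all" does not
            # match inside "usually" and "merchandise item" does not
            # match inside "merchandise items".
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, segment, flags=re.IGNORECASE):
                matches.append((term, target))
    return matches

# Placeholder targets; real entries would hold the translations.
glossary = [
    ("all", "<target 1>"),
    ("merchandise item;merchandise items", "<target 2>"),
]
segment = "a number of merchandise items are usually sold"
print(find_exact_matches(segment, glossary))
# [('merchandise items', '<target 2>')]
```

With matching of this kind, only the synonym that actually appears in the segment is reported, and short terms such as "all" are not found inside longer words.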
Here is an example.
"device" in "devices" is recognized.
This glossary is not a regex one. No other resources are used.
Actually, I don't mind if this type of match occurs; but, what's really annoying me is that this occurs only in some cases (maybe, only for some terms). So, I want to know what makes it happen.
The match is actually only for "device", but CT highlights every occurrence of the matched string in the current source segment, including inside the plural form.
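The behaviour described here can be pictured with a short Python sketch. It assumes that highlighting is a plain string search over the segment, separate from the glossary lookup itself; the function name highlight_occurrences and the bracket markers are hypothetical and say nothing about how CafeTran is implemented internally.

```python
import re

def highlight_occurrences(segment, term):
    """Bracket every occurrence of the term's characters in the segment."""
    # No word-boundary check here: the search is on the raw string,
    # so the term is also marked inside longer words.
    return re.sub(
        re.escape(term),
        lambda m: "[" + m.group(0) + "]",
        segment,
        flags=re.IGNORECASE,
    )

print(highlight_occurrences("Connect the device to other devices.", "device"))
# Connect the [device] to other [device]s.
```

Under this assumption, the highlight inside "devices" is a side effect of the whole-word hit on "device" elsewhere in the same segment, which would explain why it shows up for some segments and not for others.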
So, I should be expecting matches for a certain string of characters, rather than for a whole word in its ordinary sense.