Autocompletion, hyphenated words and segments with a fuzzy hit

Hello all!

I've been using Cafetran with some longer projects for a week or so and am really enjoying it.

However, I still have plenty to learn and a list of questions as long as my arm.

I've no doubt missed some things in the documentation, but there are a few questions I hope someone here can answer.

I've been using Cafetran on Ubuntu, if that is at all relevant.

1) Autocompletion... great, but it doesn't seem to work e.g. when editing a segment _unless_ there is a space after where you are typing.


Source sentence: "This is a source sentence"

Target sentence: "This is a target sentence"

If you double-click "source " (i.e. also selecting the space at the end) and start typing "tar...", autocompletion doesn't start for me, even after three characters, unless I put a space in after it. It can be frustrating when I know it should autocomplete.

2) Hyphens

I work from German and German writers just _love_ a hyphen. Cafetran seems to treat words separated by a hyphen slightly differently than e.g. memoQ.

For example, for the string "one-hyphen", if I go to the end of the string and press Ctrl+Left, e.g. in this post window or in memoQ or in Word, it takes me to just after the hyphen, not to the beginning of the string. In Cafetran, it takes me to the beginning of the string. Why is that? Trying to select some text in German source segments is quite frustrating as a result.
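To make the difference concrete, here is a minimal Java sketch of the two behaviours described above. It is only an illustration: the class `WordJump` and the method `previousWordStart` are made up for this example and are not taken from Cafetran, memoQ or Word. The only thing that changes between the two calls is whether the hyphen counts as part of a word.

```java
class WordJump {
    // Returns the caret position after one "jump to previous word" from `caret`.
    // hyphenIsPartOfWord is a hypothetical switch: true treats '-' like a letter.
    public static int previousWordStart(String text, int caret, boolean hyphenIsPartOfWord) {
        int i = caret;
        // Skip any separators directly to the left of the caret.
        while (i > 0 && !isWordChar(text.charAt(i - 1), hyphenIsPartOfWord)) {
            i--;
        }
        // Then move left across the word itself.
        while (i > 0 && isWordChar(text.charAt(i - 1), hyphenIsPartOfWord)) {
            i--;
        }
        return i;
    }

    private static boolean isWordChar(char c, boolean hyphenIsPartOfWord) {
        return Character.isLetterOrDigit(c) || (hyphenIsPartOfWord && c == '-');
    }

    public static void main(String[] args) {
        String text = "one-hyphen";
        int end = text.length();
        // Hyphen splits words: the caret lands just after the hyphen.
        System.out.println(previousWordStart(text, end, false)); // prints 4
        // Hyphen joins words: the caret lands at the start of the string.
        System.out.println(previousWordStart(text, end, true));  // prints 0
    }
}
```

The first call matches what I see in memoQ and Word, the second what I see in Cafetran.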

3) Fuzzy matches are also great...

... but I can't work out how to stop Cafetran treating fuzzy match segments as translated. As soon as a fuzzy match is inserted, Cafetran skips the segment (when I have "Skip translated segments" selected).

So, I've just finished one segment and press Alt+Down. A fuzzy match is inserted in that segment, then I think "Oh! I missed something in the previous segment" and click on it again, edit it, then press Alt+Down again, but now it _skips_ the segment with a fuzzy match, even though I never pressed Alt+Down in the second segment. Am I doing something wrong?

4) On a similar theme, how do you get Cafetran to just go back to whatever segment you were working on before the segment you are currently in?

It may be lazy on my part, but I constantly use Ctrl+Z, e.g. in memoQ, as a quick way to skip back to the last segment if I realise I've missed something just as I finish the segment. If I've got e.g. "Skip translated segments" selected and Cafetran automatically skips a lot of segments before the next untranslated one, I can't see how you can easily get back to the segment you were just working on.

Just a few things then!

Any help or suggestions would be much appreciated.


Thanks for the reply Igor.
For 1), your suggestion works unless I need an export of work-in-progress that I can open in Studio. I don't agree that it's logical for CT then simply to 'forget' which segments are 'Checked' if the segment status logic in CT relies on the difference between Checked and Translated... It seems that CT defaults back to the segment status of the Studio files, but doesn't function in the same way, which is surely illogical.
For 2), this was set to 50000 for me, and I don't think I had changed it. Is that the default? There is definitely a substantial delay on my system (32GB RAM, i5 6000K processor) when I open large TMs/Total Recall. It seems that CT also does not save the preliminary matches to disk, so it repeats the matching every time you open a project. Is that right?


2) The TM containing the preliminary matches is a temporary one. So, you need to export it before closing the project if you want to use it later (Memory > Export).


A few additional comments on Point 2:

1. As far as I remember (though vaguely), the default value for automatic preliminary matching is 25,000.

2. How much RAM is assigned to CafeTran? I think the default value (1,024 (?)) is too low. Check Preferences > Memory > Java memory size. My rule of thumb on Windows is 6,000 (6GB) as a minimum.

3. If you are on Windows, please also make sure that you are using the 64-bit version of Java. The 32-bit version is very slow, especially when large resources are loaded.

4. If CT is still too slow, you may want to raise the fuzzy match threshold from the default 33% and/or import "reference-only" TMs as read-only or as read-only + manual matching type (i.e., the TM remains dormant unless you perform some action on it, such as a fragment search (Alt+Enter) or a TM search).
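If you want to see what a given Java installation actually reports for points 2 and 3, a small sketch like the one below may help. Note the assumptions: the class name `JvmCheck` is made up, and the program reports on the JVM that runs this snippet, which is not necessarily the one CT is launched with. The `sun.arch.data.model` property is vendor-specific and may print `null` on some JVMs.

```java
class JvmCheck {
    // Maximum heap the running JVM will attempt to use, in whole megabytes.
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMb());
        // os.arch typically reflects the JVM's architecture, e.g. "amd64" or "x86".
        System.out.println("JVM architecture: " + System.getProperty("os.arch"));
        // Vendor-specific: usually "64" or "32", but may be null.
        System.out.println("Data model: " + System.getProperty("sun.arch.data.model"));
    }
}
```

Running it as `java -Xmx6g JvmCheck` shows roughly what a 6 GB heap limit looks like from inside Java. For CT itself, the setting to change is still Preferences > Memory > Java memory size.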


Thanks masato. I'll try these steps and see if the speed improves.


Assigning more RAM (6GB) to Java has sped up CT considerably.
Of course, it takes a lot of RAM, but not a problem on my system.
It's hard to know what the 'optimal' amount of RAM would be.


The developer encourages users to assign as much RAM as possible. The more, the better. I guess heavy users assign 8-10GB of RAM to CT.

I have a similar example to "2) Hyphens" in the original post.
It seems Java has a strong preference for selecting 'whole' words/strings.

For the German word "Fadenabzugsgeschwindigkeit", how do you select _just_ "Fadenabzug" or even just "abzug" *with the mouse*, i.e. not using Shift+Arrows?
I've been doing this for years with the mouse in other programs without thinking about it, but in CT it snaps to the whole string as soon as I've selected it.
This is obviously more of an issue for compounding/agglutinating languages.


Preferences > Workflow > Automatic selection of whole words

@tre Thanks!

