
TM/Fragments

Hello,

After using CafeTran for several months, I am still unsure about the best approach to TM usage. 

I have a rather disorganized "big mama" TM from my previous translation software. I would like to use it for full/partial segment matches and also for concordance searches. However, I don't want to use fragment matches, which just clog up my Matchboard with useless matches (these are very inaccurate for character-based languages).

Ideally, this TM would run as unobtrusively as possible (perhaps using prematching).

Could someone suggest the best way to achieve this? Would using Total Recall work best? And how do I stop fragment matching?

Thanks!

M


>And how do I stop fragment matching?


I don't like that either. This works for me: when opening a memory, uncheck Fragments memory. In the Priority selector, select Keep out of auto-assembling.

Thanks, that has stopped the fragment matching.


Any suggestions about handling my large TM?

Disable automatic matching? BTW, did you already remove all duplicates, TUs where the source is identical to the target, TUs that contain only numbers, etc.? See the Task menu.
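To make the cleanup criteria above concrete, here is a rough Python sketch of the same filtering logic. It is only an illustration and not how CafeTran's Task menu works internally: the function name `clean_tus` and the plain list of (source, target) pairs are assumptions made for this example, and a real TMX file would first need to be parsed into such pairs.

```python
import re

def clean_tus(tus):
    """Filter a list of (source, target) pairs.

    Hypothetical helper for illustration only. It drops:
    - TUs where source and target are identical,
    - TUs whose source contains no letters (numbers/punctuation only),
    - exact duplicates (same source and same target).
    """
    seen = set()
    kept = []
    for source, target in tus:
        src, tgt = source.strip(), target.strip()
        if src == tgt:
            continue  # untranslated or non-translatable segment
        if not re.search(r"[^\W\d_]", src):
            continue  # no letter in the source: numbers or symbols only
        if (src, tgt) in seen:
            continue  # exact duplicate of a TU already kept
        seen.add((src, tgt))
        kept.append((src, tgt))
    return kept
```

Calling `clean_tus` on the pairs extracted from a TM returns the reduced list in the original order, so you can compare the counts before and after.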

In the TM options of your Big Mama TM (right-click inside the TM window > Options), choose the most appropriate Matching type.


Fuzzy & hits = With this option, CafeTran analyzes source segments on a word basis (fuzzy matching) and performs statistical analysis of subsegments to determine their translation (fragment matches, aka Hits). See an explanation of Fragment Hits. Subsegment matching can be tweaked in Preferences > Memory.

Fuzzy = With this option, CafeTran only performs fuzzy matching analysis. Disabling subsegment analysis speeds up the matching process.

Fuzzy without word separator = With this option, CafeTran analyzes source segments on a character basis, which is suitable for languages without a word boundary (e.g. Chinese or Japanese).


Depending on your languages, you probably want to test Fuzzy or Fuzzy without word separator. This will disable Hits (see link for explanation).
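To illustrate why the word-based and character-based options behave so differently for languages without spaces, here is a small Python sketch. It uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure; this is an assumption for illustration and not CafeTran's actual matching algorithm. The function names `word_similarity` and `char_similarity` are hypothetical.

```python
from difflib import SequenceMatcher

def word_similarity(a, b):
    """Similarity computed over whitespace-separated words."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()

def char_similarity(a, b):
    """Similarity computed over individual characters."""
    return SequenceMatcher(None, a, b).ratio()

# English: splitting on spaces yields meaningful units to compare.
en_a = "Open the translation memory"
en_b = "Close the translation memory"
print(word_similarity(en_a, en_b))

# Japanese: no spaces, so each segment becomes a single "word" and
# the word-based comparison sees two entirely different tokens.
ja_a = "翻訳メモリを開く"
ja_b = "翻訳メモリを閉じる"
print(word_similarity(ja_a, ja_b))
print(char_similarity(ja_a, ja_b))
```

With the Japanese pair, the word-based score collapses because each whole segment counts as one token, while the character-based score still reflects the shared beginning of the two segments.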

Thanks for the very helpful suggestions!


How about using Total Recall? I tried it before but it takes a long time to load each time I start up CafeTran.

Opinions differ on this topic (TR).


Personally I don't use it at all.


I'd say it all depends on the specs of your computer. If it is fast enough and has enough RAM, you might not need TR.


Also consider: many useful MT engines are currently available, each with pros and cons. For me, the fact that e.g. DeepL has indexed many manuals (my field of translation) means I actually learn from other people's ideas every day. So for me, MT is useful. (Used critically, and often laughing at funny suggestions. They sometimes brighten my day.)


Back to what I wanted to say: you might want to investigate whether small TMs (e.g. 200K TUs) in combination with 3 or 4 MT engines are a better / nicer / more useful approach than one BM, or one BM in TR.

Using Total Recall is a valid approach. But unless you intend to open and query the database itself (manual concordance only, once you open a database table from the Total Recall menu), what it does is create a TMX translation memory out of the TUs stored in the database, depending on the recall settings.


Total Recall settings are almost identical to TM settings, so you can apply the most relevant TM settings to Total Recall as well. 


https://github.com/idimitriadis0/TheCafeTranFiles/wiki/3-TM-options#total-recall-options

Thanks for the suggestions, much appreciated!

It seems my "big mama" TM accumulated over 8 years of translation is actually considered "small" (only 170k TUs), so I will keep using that as a TM for now, and try out the settings suggested by Jean.

Hi Mark,


Coincidence has it that I just received a 220 K TUs (segments) SDL WorldServer TM from a new client.


After removing all duplicates, segments without numbers and segments where source=target, only 147 K TUs remained.



I have also been wondering how to prevent TM fragments from being shown, by Total Recall in my case. Thank you for suggesting to select "keep out of auto-assembling", this worked for me, too.


But shouldn't it be enough to just deselect "Fragments memory" to stop CT from showing fragments in the match board? When I selected only "Segment memory" in the Total Recall options in the dashboard, with "Fragment Memory" deselected, CT was still showing the fragments and, worse yet, prompting them. Not knowing how to prevent this, even though only "Segment Memory" was selected, was driving me nuts.


One more question: How to make CT remember the choice of "keep out of auto-assembling" for a TM or Total Recall? I even saved a project template with this setting, but when I quit a project and open it again, or create a new project, "medium priority" is selected by default and I have to manually select "keep out of auto-assembling". It would be great if CT could remember this setting.


Thank you in advance for your help!

You can also select "Fuzzy" only in the Matching type field to limit the fragment matches to those fragment pairs that are in your memory, without the program looking for virtual hits. As of the next update, the "Keep out of auto-assembling" option, if selected, will be remembered permanently. Thanks!



I am looking forward to being able to select "Keep out of auto-assembling" permanently both for glossaries and TMs. Please also make the other options permanently selectable, such as "Read only". Thanks! :)
