
Automatically creating a single project-specific glossary from all glossaries you have

You may want to create one glossary for a particular type of project or document from all your glossaries.

Translators typically use more than one glossary. But, for a particular project, you may not need all entries of all those glossaries. You use some from this one, some from that one ...

Or you may want to keep fewer glossaries open so that the interface stays simple.

If so, convert those glossaries into TMX files (Go to Memory > Import). If your glossaries have synonyms on either or both sides, be sure to unbundle them in advance (see CafeTran Wiki for source-side splitting).

Then, pretranslate the document with these TMX files.

Convert the exported pretranslation.tmx into a text file (Go to "Memory > Save as" and save it as .txt).
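The unbundling step mentioned above (splitting synonyms before converting a glossary to TMX) could be sketched roughly as follows. This is only an illustration, not a CafeTran feature: it assumes a tab-delimited glossary whose synonyms are separated by semicolons and whose header line starts with "#". The function names are hypothetical.

```python
def unbundle(pairs, sep=";"):
    """Expand (source, target) pairs whose sides hold several synonyms
    into one pair per source/target combination."""
    result = []
    for source, target in pairs:
        sources = [s.strip() for s in source.split(sep) if s.strip()]
        targets = [t.strip() for t in target.split(sep) if t.strip()]
        result.extend((s, t) for s in sources for t in targets)
    return result


def unbundle_lines(lines, sep=";"):
    """Take tab-delimited glossary lines and return unbundled lines.
    Header lines (starting with '#') and incomplete lines are skipped."""
    pairs = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        pairs.append((fields[0], fields[1]))
    return ["{}\t{}".format(s, t) for s, t in unbundle(pairs, sep)]
```

Each resulting line holds exactly one source term and one target term, which is the shape a translation unit needs.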



I wonder whether I can use this topic to ask my question. I think this method may be one solution to the problem below, but I could not actually get it to work in CafeTran, although it should work in theory :)

When I translate a document, 'we', for example, is frequently translated as 'wareware wa' (Japanese), but I want to use 'watashitachi wa'. So I added this pair to a glossary with Alt + G, and it now appears in a glossary tab. I selected this tab and chose Memory - Import - Import glossary, then tried Memory - Save all memories and also Memory - Save memories. However, 'we' is still translated as 'wareware wa'.

I would appreciate any advice on solving this problem.




It looks like your TM has these two alternative translations (wareware wa and watashitachi wa).

When there are two or more alternative translations in your resources, there are a few ways to tell CafeTran to prefer one of them.

1. If the alternative translations are in separate resources (TMs/glossaries), give a higher priority to the one that contains your preferred translation (right-click on its tab to change the settings).

2. When an undesired alternative appears in the editor section, select it and right-click. You will then see all the alternatives. When you choose one of them, CafeTran remembers your selection and gives it the highest priority from the next segment onward, regardless of the priority level of the glossary/TM that contains it.

Example (English as the target language)


Note that this feature is now automatic. Make sure "Automatic Fragments Adjustment" is checked (Preferences > Auto-assembling).



Thank you Masato for your reply.

I am not sure why, but I could not get either of your methods to work.

1. When I right-click the tab, the options displayed are: Dock tab to ... (six ways), Vertical docking divider, Join tabs, Disjoin tabs and so on. I could not find an option to change the priority.

2. When I set Japanese as the target language, a toolbar appears that is similar to the top section of your attached figure, but no alternative Japanese translation is displayed.

I wonder whether this function works differently when the target language is Japanese.

I would appreciate further replies.




Point 1:
My description was somewhat inaccurate. Please right-click on the pane itself, not on its tab.


Then, you can choose the high priority.

Point 2:

It's working like a charm in my environment even when the target language is Japanese.


If nothing shows up, please change your selection (the selected word/phrase) and see what happens. If CafeTran still shows nothing on this panel, it probably means that no alternative translation is available for that particular selection.



Thank you Masato, I will try these after tomorrow and get back to you.



Hi there

I want to report the situation in my CafeTran in the hope of getting more specific advice for my setup.

I imported some specific pairs of English and Japanese words, including 'we' and 'watashitachi wa'.

In the Matchboard window, the translation has not changed: 'we' is still usually translated as 'wareware wa'. However, the same window also displays the following:

(1) A mixture of English and specific Japanese words. These Japanese words are the ones I added to the memory from the Excel file. The English text has a red background, and the Japanese text has a green background. Only the English words included in the imported Excel file are changed to Japanese, but these Japanese words are not reflected in the machine translation in the Matchboard window, which can be transferred with Alt + H.

(2) A table of English and Japanese for the words whose Japanese is displayed above.

The list of alternative words is not displayed in my CafeTran as it is in Masato's screenshot. Only the toolbar is displayed, as I mentioned before.

Thank you in advance for your reply.




Your workflow (a mixture of English and specific Japanese words) seems to be somewhat different from what I usually do. Could you give me some screenshots of your resources (glossaries/TMs) so that I can get an accurate picture of what's bothering you?

For now, my guess is that the culprit is redundant white space in the source-side entries of your resources, which prevents matching.
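One way to rule out stray white space is to clean the source column of the tab-delimited file before importing it. The sketch below is only an assumption about how such a cleanup could look (hypothetical function names, tab-delimited input); it trims each source entry and collapses runs of white space, including full-width spaces, and leaves the target column untouched.

```python
import re


def normalize_entry(text):
    """Collapse runs of white space (including U+3000) into one space
    and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def clean_line(line):
    """Normalize only the source-side (first) field of a tab-delimited line."""
    fields = line.rstrip("\r\n").split("\t")
    fields[0] = normalize_entry(fields[0])
    return "\t".join(fields)
```

For example, a line such as " we  <TAB>私たちは" would come out as "we<TAB>私たちは".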



Hi Masato

Thanks for your prompt reply. OK, I have attached a screenshot.

I suppose my CafeTran recognises 'we' as 'watashitachi wa', as shown by the text with the green background. However, the main translation above does not reflect this.

Thanks in advance for your reply.



Thanks for the screenshot.

It shows that "私たちはワタシ" is one single entry, not two. My guess is that the original Excel file may have these terms on separate lines within the same cell (with a line break inserted).

To get what you want, the resources should be restructured as follows:

Glossary (first line is the header)
#en-US   #ja-JP
we   私たちは;ワタシ

TM (one pair in one translation unit)
we   私たちは
we   ワタシ

Please try.

Please also note that CafeTran does not support alternative entries in TMs. So, if you want to use alternatives, make sure that the Excel file has those alternative entries separated only by a semicolon on the same line within the same cell, and then import it into a glossary, not a TM.

BTW, if your preferred translation is 私たちは in some projects, and ワタシ in others, you may want to create a separate glossary for the latter that has the reversed order of entries, as follows.

#en-US   #ja-JP
we   ワタシ;私たちは
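If you keep many such entries, the reversed-order glossary could be generated rather than typed by hand. The following is a minimal sketch under the assumptions that the glossary is tab-delimited and that alternatives are separated by semicolons; `prefer` is a hypothetical helper, not part of CafeTran.

```python
def prefer(line, preferred, sep=";"):
    """Move the preferred alternative to the front of the target field.
    Lines without a tab, or without the preferred term, are returned as-is."""
    if "\t" not in line:
        return line
    source, target = line.split("\t", 1)
    alternatives = [t for t in target.split(sep) if t]
    if preferred not in alternatives:
        return line
    reordered = [preferred] + [t for t in alternatives if t != preferred]
    return source + "\t" + sep.join(reordered)
```

Because the header line does not contain the preferred term, it passes through unchanged.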


Thanks, Masato.

The Excel file includes

we   私たちは

In the screenshot, I suppose some function of the PC, CafeTran or the TM has extracted the reading of the Chinese character 私 and shown it as ワタシ.

I think what I want to do at the moment may be an easy thing. I want to assign a particular Japanese translation to a particular English word.

What I am doing at the moment is:

(1) Make an Excel file of the pairs, like 'we 私たちは'.

(2) Import this Excel file into the memory with the 'High priority' setting. The default seems to be 'Medium priority', and I set 'High priority' by hand every time. I also sometimes leave this setting at 'Medium priority'.

However, the memory import does not seem to be successful. I mean that the translations of the specific words in the glossary are not changed by this method.

I am afraid that I am using the wrong method.




I confirmed that CT imports "furigana" (phonetic) data from an Excel file as well.


So, for the benefit of all Japanese users including you, I want to encourage you to ask the developer NOT to import these furigana phonetic characters into TMs. He should be able to understand what you are talking about, because I requested the same thing when I found that furigana was imported from a source Excel file for translation, and he improved the Excel file filter accordingly.

For now, I think there are two workarounds.

1. Remove all furigana characters from the Excel file before importing.
I once created a macro for this, but I don't have it now and I don't remember how I did it. So, please search the Internet if you prefer this option. Please note that "remove" means "removing," not "hiding." (This site may be useful)

2. Convert the Excel file into a tab-delimited Unicode text and import.
For now, this should be safer. Furigana is discarded during the conversion process, and you only get "we" and "私たちは."
In this case, it's better to save the text file as UTF-8, which is the default encoding for almost all CAT tools including CafeTran.
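If the text file exported from Excel is not already UTF-8, it can be re-encoded afterwards. The sketch below assumes that the "Unicode text" export is UTF-16 with a byte order mark (please check your own file, since this is an assumption); the function name and paths are hypothetical.

```python
def reencode_to_utf8(src_path, dst_path, src_encoding="utf-16"):
    """Read a text file in src_encoding and write it back as UTF-8.
    newline="" keeps the original line endings untouched."""
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

Usage would look like `reencode_to_utf8("glossary_utf16.txt", "glossary_utf8.txt")`, where both file names are placeholders.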

You can import this bilingual text file either as a glossary or translation memory, as follows.

To import as a glossary:
Add the header line in the text file (please see my previous posting), and add the file as a new glossary (Menu > Glossary > Add glossary).
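Adding the header line can also be scripted. This is a rough sketch that assumes the header fields are separated by a tab and that any existing header starts with "#"; the language codes are the ones used in my previous posting, and `with_header` is a hypothetical helper.

```python
def with_header(text, source_lang="en-US", target_lang="ja-JP"):
    """Prepend a '#source<TAB>#target' header unless the first line
    already looks like a header (starts with '#')."""
    lines = text.splitlines()
    if lines and lines[0].startswith("#"):
        return text
    header = "#{}\t#{}".format(source_lang, target_lang)
    return header + "\n" + text
```

Text that already begins with a header is returned unchanged, so running it twice does no harm.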

To import as a translation memory:
1. Open the text file directly as a translation memory (Menu > Translation memory > Open memory). Once opened, it can be saved in the TMX format (Menu > Translation memory > Save memory as).
2. Alternatively, create or open a memory, and import the text file into it (Menu > Translation memory > Import > Import tab delimited memory).

My mistake.

>> 2. Alternatively, create or open a memory, and import the text file into it (Menu > Translation memory > Import > Import tab delimited memory).

This does not work. Please forget about it. This feature is for opening a translation memory created by Wordfast.

>> 1. Open the text file directly as a translation memory (Menu > Translation memory > Open memory). Once opened, it can be saved in the TMX format (Menu > Translation memory > Save memory as).

I thought this would work perfectly, but it does not. You can open the file as described, but saving it as TMX does not work.

So, if you prefer a translation memory, add the text file as a glossary, create or open a translation memory, and import the glossary into the TM (Menu > Translation memory > Import glossary).

Sorry for the confusion.



One quick solution.

There is a free TM creation tool called TMbuilder, which can convert Excel files into translation memories. Furigana is discarded. It's worth trying.


