Newbie alert! Trying to get started: TMs, number conversion

Coming from WFP, I'm having a hard time with the following (using the demo version):

- I exported my TMs (both big and small ones so the demo works), loaded them in the dashboard. On WFP, for this specific document, the analysis shows 3482 words out of 6111 that are no matches. Using the same TM in CTE, it gives me 4133/6089. I would expect some difference between the apps, but maybe this is too much? (This specific TM is larger than the demo limit, but the app is reading it, not sure it would save to it)

- I can't figure out how to handle numbers easily. I need the dot/comma conversion, and I need to transfer these numbers to the target without having to type them (I work with financial docs, too dangerous to type numbers and make a mistake). I see the numbers highlighted in pink, then I can use the F4 shortcut, but then there is no dot/comma conversion. I found out I can also type three digits (my prompter starts at 2), delete the last digit, and then a suggestion pops up for a number that does not exist in the source:


- Still numbers. I asked CTE to Insert all exact matches, just to see what it would do. It filled the target segments that were 100%, but did not change the numbers:


My current configs that I believe are for numbers:

- Workflow > Replace characters at source transfer: unchecked (have tried checked, no change, even with commas and dots in the correct boxes)

- Prompter: checked - Prompt phrases, Two-word, Auto case adjustment; Prompting starts (2), Minimal word length (3)

- Auto-assembling: checked - Transfer numbers, Format numbers

- CTE has also inserted 100% matches that are not exact matches. In one example, the source in the TM was "Em 2019, foi criado o Comitê de Engajamento, que envolve diferentes áreas, percepções e ideias para assegurar um ambiente de trabalho saudável, com equipes e colaboradores comprometidos", and the source in the new document was "Em 2020, o acompanhamento do engajamento da organização foi realizado através de reuniões do Comitê de Engajamento, que envolve diferentes áreas, percepções e ideias para assegurar um ambiente de trabalho saudável, com equipes e colaboradores comprometidos" (everything before "Comitê de Engajamento" differs). Even so, the segment was marked as 100% and filled with the wrong translation, and the TM tab did not mark the differences in red.

I think these are the most pressing issues for now :-) TIA

Hi Dekka,

I am not sure if this helps, but there is a setting that you can toggle on/off or tweak that could help.

It is found in Preferences > Workflow > Replace characters at source transfer.

These field pairs allow you to set characters you wish to replace during source transfer.

Note: This replacement option is a helper to the "Transfer numbers to matches" feature. It lets you replace the defined characters in a numerical expression during the transfer from the source to the target segment.

Transfer numbers to matches can be found in Preferences>Auto-assembling.

If the option is ON, CafeTran automatically inserts the numbers present in the source segment into the target suggestion, replacing those from fuzzy matches.

Auto-assembling settings are also used when CafeTran handles TM matches.

Hi Igor,

Can you please elaborate on your previous reply on this? I am also interested to know. You said F4 should turn e.g. 12.4 into 12,4 if your languages use those respective number formats.

How does CafeTran achieve the decimal and number formats difference based on the project languages?


> How does CafeTran achieve the decimal and number formats difference based on the project languages?

It does so by checking the language codes of the project and then converting numbers from the number format of the source language to that of the target language. It relies on helper methods built into the programming language itself, which may not cover complex cases.
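To make the idea concrete, here is a minimal Java sketch of that kind of locale-driven conversion. It is not CafeTran's actual code: the class name `LocaleNumberSketch` and the `convert` method are hypothetical, and only standard `java.text` classes are used. The sketch parses a number with the source locale's separators and re-renders it with the target locale's, keeping the source's decimal count and applying grouping only when the source used it.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

public class LocaleNumberSketch {

    /** Re-renders a number written in the source locale's format with the target locale's separators. */
    static String convert(String number, Locale source, Locale target) {
        DecimalFormatSymbols sourceSymbols = DecimalFormatSymbols.getInstance(source);
        DecimalFormat parser = new DecimalFormat("#,##0.###", sourceSymbols);
        parser.setParseBigDecimal(true); // exact values, no binary floating-point rounding

        ParsePosition pos = new ParsePosition(0);
        Number parsed = parser.parse(number, pos);
        if (!(parsed instanceof BigDecimal) || pos.getIndex() != number.length()) {
            return number; // not fully parseable as a plain number: leave it untouched
        }

        // Keep the same number of decimals as the source, e.g. "1,234.50" keeps 2.
        int decimalAt = number.lastIndexOf(sourceSymbols.getDecimalSeparator());
        int decimals = decimalAt < 0 ? 0 : number.length() - decimalAt - 1;

        DecimalFormat formatter =
                new DecimalFormat("#,##0.###", DecimalFormatSymbols.getInstance(target));
        formatter.setMinimumFractionDigits(decimals);
        formatter.setMaximumFractionDigits(decimals);
        // Group only if the source did, so a bare year such as 2021 stays ungrouped.
        formatter.setGroupingUsed(number.indexOf(sourceSymbols.getGroupingSeparator()) >= 0);
        return formatter.format(parsed);
    }

    public static void main(String[] args) {
        Locale english = Locale.US;
        Locale portuguese = Locale.forLanguageTag("pt-BR");
        System.out.println(convert("12.4", english, portuguese));
        System.out.println(convert("1,234.50", english, portuguese));
        System.out.println(convert("2021", english, portuguese));
    }
}
```

Under these assumptions, `12.4` from English comes out as `12,4` for Brazilian Portuguese, and a bare `2021` is left alone because it contains no separators. Anything the parser does not fully consume, such as `12.4%`, is returned unchanged, which is the kind of complex case a real tool would still have to handle.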

Hi Igor,

Thank you for your answer. I assume this applies when the "Format numbers" option is enabled.

Is there a way to (re)view the rules used per language code?

For example, 1,000,000.00 is usually written as 1 000 000,00 (with spaces or preferably non-breaking spaces, and a comma for decimals) in French.

Is CafeTran supposed to do this conversion when translating from English to French?

The specific locale implementation for a given language is buried in the Java code. Oracle makes available very general help documents without specifying the implementation details for languages. For example, it looks like the grouping (thousands) separator implemented for French in Java is not the normal non-breaking space but the narrow non-breaking space. Please expect some improvements to handling (converting to and from) numbers with the grouping (thousands) separator in the next update.
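For anyone who wants to check this on their own machine, the following Java sketch prints the grouping separator the running JDK uses for French and shows how a different one could be substituted. The class and method names (`FrenchGroupingSketch`, `groupingCodePoint`, `formatFrench`) are made up for illustration and are not part of CafeTran. The reported separator depends on the JDK version and its locale data, so no fixed output is claimed for the first call.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class FrenchGroupingSketch {

    /** Returns the grouping separator the running JDK uses for a locale, as a U+XXXX string. */
    static String groupingCodePoint(Locale locale) {
        char separator = DecimalFormatSymbols.getInstance(locale).getGroupingSeparator();
        return String.format("U+%04X", (int) separator);
    }

    /** Formats with French symbols, but with a caller-chosen grouping separator. */
    static String formatFrench(BigDecimal value, char groupingSeparator) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.FRANCE);
        symbols.setGroupingSeparator(groupingSeparator); // override the JDK default
        DecimalFormat format = new DecimalFormat("#,##0.00", symbols);
        return format.format(value);
    }

    public static void main(String[] args) {
        // Varies by JDK: typically the no-break space or the narrow no-break space.
        System.out.println(groupingCodePoint(Locale.FRANCE));
        // Force the ordinary no-break space (U+00A0) as the thousands separator.
        System.out.println(formatFrench(new BigDecimal("1000000.00"), '\u00A0'));
    }
}
```

Overriding the symbol this way is one possible approach to a user-configurable separator; whether CafeTran exposes anything like it is not something this sketch assumes.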

Hi Igor,

Thank you for providing more information about this.

Indeed, in French printed works, the grouping separator to be used is what we call an "espace insécable fine" (narrow non-breaking space, or U+202F), but otherwise the simple non-breaking space is expected and used.

In 5 years, I have not had a single translation project where the narrow non-breaking space was required.

This is a good example to highlight the need for the user to be able to view/access and if necessary set/customize common numbering format rules (like decimal and grouping separators).

Plus, not knowing the rules that CafeTran reads, the user does not know what to expect and which types of numbers can be correctly converted into the target language.


But even before considering that, there is a more pressing question.

According to the description of "Format numbers", with this setting enabled, "CafeTran formats numbers to the target language numbering system if it is different from the source language system".

How can I see this number format conversion in action?

With "Transfer numbers to matches" and "Format numbers" enabled in Preferences>Auto-assembling (but not "Replace characters at source transfer", which I find too crude/sweeping for my needs, and which cannot handle both periods to commas and commas to a non-breaking space or a period), I am unable to see what these numbering rules do.

There is simply no conversion occurring. 

Numbers stay the same in English to French (with F4, source transfer or fuzzy match/auto-assembling), but also from English to Portuguese and vice versa, for example. I think the issue is general and not related to a single target language.

I understand there will be an improvement when the grouping separator is used, but even for decimals, I cannot reproduce the conversion that is supposed to happen when the setting is enabled. Am I missing something?


At least for numbers which include grouping and/or decimal separators, I think there should be an easy way (F4, if they can be recognized as non-translatables) to transfer these and have them converted into the target language formatting, maybe with more than one formatting option (at least source and target formatting). For example, years do not take the grouping separator in French: it's 2021, not 2 021.
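As a rough illustration of the behaviour I have in mind (this is not how CafeTran works, just a sketch in Java with hypothetical names `SegmentNumberTransfer` and `transferNumbers`), the code below finds the numbers in a source segment and swaps their separators in a single pass, with the target grouping and decimal characters chosen by the user. A plain year like 2021 contains no separators, so it passes through unchanged.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SegmentNumberTransfer {

    /** Rewrites every number in the segment, mapping source separators to the given target ones. */
    static String transferNumbers(String segment, Locale source,
                                  char targetGrouping, char targetDecimal) {
        DecimalFormatSymbols src = DecimalFormatSymbols.getInstance(source);
        String g = Pattern.quote(String.valueOf(src.getGroupingSeparator()));
        String d = Pattern.quote(String.valueOf(src.getDecimalSeparator()));
        // Either a properly grouped number (1,000,000.00) or a plain one (2021, 12.4).
        Pattern number = Pattern.compile(
                "\\d{1,3}(?:" + g + "\\d{3})+(?:" + d + "\\d+)?|\\d+(?:" + d + "\\d+)?");

        Matcher m = number.matcher(segment);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(segment, last, m.start()); // text between numbers is copied as-is
            for (char c : m.group().toCharArray()) {
                // Single pass per character, so a dot-to-comma swap cannot clobber
                // a comma that was itself just converted.
                if (c == src.getGroupingSeparator()) {
                    out.append(targetGrouping);
                } else if (c == src.getDecimalSeparator()) {
                    out.append(targetDecimal);
                } else {
                    out.append(c);
                }
            }
            last = m.end();
        }
        out.append(segment, last, segment.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String source = "Revenue in 2021 was 1,000,000.00 and grew 12.4%.";
        // English source; French-style target with a no-break space (U+00A0) and a comma.
        System.out.println(transferNumbers(source, Locale.US, '\u00A0', ','));
    }
}
```

With an English source this turns 1,000,000.00 into 1 000 000,00 (using non-breaking spaces) and 12.4 into 12,4, while 2021 and the sentence-final period stay as they are. Dates and currency amounts would need their own recognition rules on top of this.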

Other common numbering formats could include dates and currency amounts, including the currency symbol (for example, €1 is written 1 € in French, with a non-breaking space, and the same applies if the euro is written as EUR). This calls for CafeTran to recognize that a number is a date (and that the / separators are part of the number), or that the currency symbol or shortened currency name is part of the number too, even where there is no space before or after the number.

For translators who are routinely dealing with numbers and need to rely on the ability of a CAT tool to quickly transfer these numbers and have them accurately converted according to target language rules, improving how CafeTran deals with this aspect could make a big difference!

Let's see what improvements the next CTE update brings in this regard.
