Improve surrounding of quotes with tags

If the source has tag"word"tag and the target uses the same type of quotes, it would be great if CafeTran Espresso placed the tags around the quotes in the target too:


tag"wört"tag


In a flash of genius I decided to define |\" as a non-translatable, since CafeTran Espresso correctly adds tags to non-translatables.


However, perhaps because of the US International keyboard layout with dead " keys, superfluous spaces are added.

For the developer: note that this structure


tag"word"tag


is used very often, also with other punctuation marks:


tag[word]tag

tag(word)tag

Perhaps something can be added to this menu:


image

E.g. by letting the user define a tag sequence in the Prefs:


image


Of course, the tag numbering should be correct.

I just switched my keyboard layout to US (not International) to rule out the influence of dead keys.


Same problem: a superfluous space is added after inserting |\"


image


image


I have this macro that surrounds numbers and numbers in brackets with tags.


I have another macro that surrounds words between quote characters with tags.


Things start to get complicated (although they are still realistic) when a segment contains both numbers in brackets and words between quote characters, since the tag numbering must take both types of entities into account. (I haven't implemented this yet; perhaps it'll be too complicated for me to fix.)
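To illustrate the numbering problem, here is a minimal Python sketch (not my actual macros, and not anything CafeTran Espresso does internally). It assumes tags are written as `<x1/>`, `<x2/>`, and so on, and it handles both entity types in a single pass so that one counter serves both. The names `ENTITY` and `tag_entities` are made up for this example.

```python
import re

# Assumption: an "entity" is either a number in brackets, e.g. (1),
# or a word or phrase between straight double quotes, e.g. "Beispiel".
ENTITY = re.compile(r'\(\d+\)|"[^"]+"')


def tag_entities(segment, first_tag=1):
    """Wrap every entity in a pair of sequentially numbered placeholder tags."""
    counter = first_tag

    def wrap(match):
        nonlocal counter
        tagged = f"<x{counter}/>{match.group(0)}<x{counter + 1}/>"
        counter += 2  # each entity consumes one opening and one closing tag
        return tagged

    return ENTITY.sub(wrap, segment)


print(tag_entities('ein "Beispiel" mit (1) und (2)'))
```

Because both patterns sit in one alternation, the entities are numbered in the order in which they appear in the segment, regardless of their type.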


Here are two segments:


image


I think that CafeTran Espresso should detect that in the second segment, the fuzzy match, the entities (numbers, quotes) are identical.


When I insert the fuzzy match (FM), the tags aren't there. This is the correct translation:


image


I inserted the words 'ander' and 'ook' and had to redo the tagging manually. That's because CafeTran Espresso only stores the tag positions in the memory. This has advantages (clean memories etc.), but for fuzzy matches a lot of information is lost, resulting in extra, tedious work.


Here's an idea, and I'm not sure whether it's technically possible:


Can CafeTran Espresso be enhanced to remember that fragments are surrounded (left, right, or left and right) by tags? Not necessarily the tag number (since this can change), but the fact that a tag is present on the left side, the right side, or both sides?


So for this example segment:


–tag

the translation of tag"Beispiel"tag

tag(1)tag

tag(2)tag

the translation of tag"Aufzählung"tag
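The idea could be sketched roughly as follows in Python. This is only an illustration of what "remembering the sides" might mean, not a description of how CafeTran Espresso stores its memories. It assumes tags look like `<x1/>` and that the fragments of interest are quoted words and numbers in brackets; `TagSides` and `tag_sides` are hypothetical names.

```python
import re
from typing import NamedTuple


class TagSides(NamedTuple):
    fragment: str   # the quoted word or bracketed number itself
    left: bool      # True if a tag directly precedes the fragment
    right: bool     # True if a tag directly follows the fragment


# Assumption: tags are written as <x1/>, <x2/>, ...
TAG = r"<x\d+/>"
FRAGMENT = re.compile(rf'({TAG})?("[^"]+"|\(\d+\))({TAG})?')


def tag_sides(segment):
    """Record for each fragment whether it has a tag on its left and/or right."""
    return [
        TagSides(m.group(2), m.group(1) is not None, m.group(3) is not None)
        for m in FRAGMENT.finditer(segment)
    ]


for entry in tag_sides('<x2/>"Beispiel"<x3/> mit einer <x4/>(1)<x5/> Nummerierung'):
    print(entry)
```

Only the presence of a tag is recorded, not its number, so the record stays valid when the numbering shifts in a fuzzy match.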


You might try putting the following regular expression into your non-translatables to catch a number surrounded by quotes or brackets:


|[\"(]*\d+[\")]*
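To see what this expression catches, it can be tried outside CafeTran, for example in Python. This sketch assumes that the leading `|` only marks the entry as a regular expression and that `\"` is simply an escaped quote, so both are left out of the pattern below; `find_candidates` is a made-up helper name.

```python
import re

# Assumption: the leading "|" is only the marker for a regex entry,
# and \" stands for a plain double quote.
NUMBER_WITH_PUNCTUATION = re.compile(r'["(]*\d+[")]*')


def find_candidates(segment):
    """Return every number, together with any adjacent quotes or brackets."""
    return NUMBER_WITH_PUNCTUATION.findall(segment)


print(find_candidates('mit einer (1) Nummerierung und (2) einer "Aufzählung"'))
print(find_candidates('Seite "12" und 7'))
```

Note that the expression requires at least one digit, so a quoted word such as "Aufzählung" is not matched, while a bare number without any quotes or brackets is.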


During the transfer to target, CT should carry over the surrounding tags as well. As for the regular terms or fragments, the new feature in the next update called "Term Patterns" should handle it. This will enable the user to construct the terms with invariant or non-translatable parts. For example, if you put the following into your glossary:


"{1}"="{1}"


CafeTran will translate any term included in your glossaries along with the invariant quotes.
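As a rough illustration of the idea only (the feature is not released yet, so this is not its actual implementation), a pattern with a single `{1}` placeholder could be applied like this in Python. The function name `apply_term_pattern` and the small glossary are made up for the example.

```python
import re


def apply_term_pattern(text, source_pattern, target_pattern, glossary):
    """Translate glossary terms that appear inside the invariant parts of a pattern.

    Assumption: each pattern contains exactly one "{1}" placeholder.
    """
    head, tail = source_pattern.split("{1}")
    # Escape the invariant parts and capture whatever stands in for {1}.
    regex = re.compile(re.escape(head) + r"(.+?)" + re.escape(tail))

    def substitute(match):
        term = match.group(1)
        if term not in glossary:
            return match.group(0)  # unknown term: leave the text untouched
        return target_pattern.replace("{1}", glossary[term])

    return regex.sub(substitute, text)


glossary = {"Beispiel": "voorbeeld", "Aufzählung": "opsomming"}
print(apply_term_pattern('ein "Beispiel" und eine "Aufzählung"', '"{1}"', '"{1}"', glossary))
```

The quotes are carried over unchanged, and only the part that stands in for `{1}` is looked up in the glossary.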


Your suggested solution with |[\"(]*\d+[\")]* works fine for auto-assembling and inserting a DeepL suggestion:


image


(It doesn't work when inserting an FM.)


So far, so good.


When I add the other tags (1st one manually, the other ones automatically), of course the tag numbering isn't consistent:


image

CafeTran Espresso complains, but will correct the tag order. If you return to the same segment, you'll see:


image


When you look at the tag content, you see:


–<x1/> Das ist noch ein anderes <x2/>"Beispiel"<x3/> mit einer <x4/>(1)<x5/> Nummerierung und <x6/>(2)<x7/> einer <x8/>"Aufzählung"<x9/>.

–<x1/> Dit is nog een ander <x2/>"voorbeeld"<x3/> met een <x4/>(1)<x5/> nummering en <x6/>(2)<x7/> een <x8/>"opsomming"<x9/>.
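As a sanity check, the two lines above can be compared with a few lines of Python. This is only a naive comparison of the tag numbers, and it says nothing about how the QA in CafeTran Espresso actually works; the helper names are made up.

```python
import re

TAG = re.compile(r"<x(\d+)/>")


def tag_numbers(segment):
    """Return the tag numbers of a segment in order of appearance."""
    return [int(n) for n in TAG.findall(segment)]


def same_tags(source, target):
    """True if source and target contain the same tags in the same order."""
    return tag_numbers(source) == tag_numbers(target)


def numbered_in_order(segment):
    """True if the tags are numbered 1, 2, 3, ... without gaps."""
    numbers = tag_numbers(segment)
    return numbers == list(range(1, len(numbers) + 1))


source = '–<x1/> Das ist noch ein anderes <x2/>"Beispiel"<x3/> mit einer <x4/>(1)<x5/> Nummerierung und <x6/>(2)<x7/> einer <x8/>"Aufzählung"<x9/>.'
target = '–<x1/> Dit is nog een ander <x2/>"voorbeeld"<x3/> met een <x4/>(1)<x5/> nummering en <x6/>(2)<x7/> een <x8/>"opsomming"<x9/>.'

print(same_tags(source, target), numbered_in_order(target))
```

By this simple measure the source and target tags agree and are numbered consecutively.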


However, when you run a QA for tags, you get errors for all 4 segments:


image


Why's that?


Despite the reported tag errors, the target document could be exported perfectly:


image


Although the Word document is correct, the targets in the grid aren’t: note the incorrect italics.

Let's wait and see what the new build brings us :).


Eagerly awaiting it ...
