
Poor man's regex tagger

I'm trying to create a workflow for hiding invariant and nuisance strings in both native CafeTran Espresso 2019 projects and third-party projects.


In CafeTran Espresso 2019's native MS-Word projects you have maximum flexibility to hide what you don't want to see (duh).


However, in InDesign, FrameMaker etc. projects, you don't have this feature.


Nor is it possible in third-party projects. Hence the need for a way to get this distracting info out of your project.


NOTE: This solution is only attractive for projects that don't contain (many) tags.


Suggested solution:

  • Create a native CafeTran Espresso 2019 project or open a third-party project package.
  • Copy all source segments to the target side.
  • Export to bilingual.
  • In MS-Word: hide the first and second column.
  • In the third column, use wildcard searches to hide everything that you don't want to see in CafeTran Espresso 2019.
  • Create a new project and import the manipulated bilingual export table.
  • Export after finishing translation.
  • Unhide everything in MS-Word.
  • Import the table as a TM.
  • Use the TM to translate the Studio project etc.
Possible improvements:
  • CafeTran Espresso 2019 could let you import a bilingual export document directly into a TM.
  • Dump the current project (the translated bilingual export document) to a TM while unhiding all hidden info. This would be the perfect solution.
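
To make the hide/unhide roundtrip concrete, here is a minimal Python sketch of the idea. It is not what CafeTran Espresso or MS-Word do internally: numbered placeholders stand in for Word's hidden text, and the `NUISANCE` pattern (part numbers like AB-1234) and the function names `hide` and `unhide` are made up for illustration.

```python
import re

# Hypothetical pattern for invariant strings: part numbers such as AB-1234.
NUISANCE = re.compile(r"\b[A-Z]{2}-\d{4}\b")


def hide(segment, pattern=NUISANCE):
    """Replace each match with a numbered placeholder.

    Returns the masked segment and the list of hidden strings,
    which plays the role of the hidden text in the Word table.
    """
    hidden = []

    def _mask(match):
        hidden.append(match.group(0))
        return "{%d}" % len(hidden)

    return pattern.sub(_mask, segment), hidden


def unhide(segment, hidden):
    """Put the hidden strings back in place of their placeholders."""
    return re.sub(r"\{(\d+)\}", lambda m: hidden[int(m.group(1)) - 1], segment)


masked, hidden = hide("Replace valve AB-1234 with AB-5678.")
# The translator only sees the masked segment and translates it.
translated = "Vervang klep {1} door {2}."
restored = unhide(translated, hidden)
```

Because every segment with the same structure collapses to the same masked text, propagation should no longer depend on how similar the hidden strings are.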


image


image

image


image


image

image

image

So: propagation is limited to very similar non-translatables. This could be improved. Note that I didn't define the strings as non-translatables, but I don't think doing so would fix propagation.


The workaround via MS-Word hides all invariant info, so propagation is optimal.

docx
(11.8 KB)
docx
(11.7 KB)

About how to import the table as a TM:

  • Open the tab-delimited text file via Memory > Open memory... 
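
If you'd rather build the tab-delimited file with a script than by saving the Word table as text, something like the sketch below could do it. It assumes the simplest possible layout, one segment pair per line with source and target separated by a tab; whether CafeTran Espresso expects a header line or language codes is not covered here, and the function name `to_tab_delimited` is hypothetical.

```python
def to_tab_delimited(pairs):
    """Turn (source, target) pairs into tab-delimited text, one pair per line.

    Tabs and line breaks inside a segment are collapsed to single spaces,
    so they cannot be mistaken for column or row separators.
    """
    lines = []
    for source, target in pairs:
        clean = [" ".join(text.split()) for text in (source, target)]
        lines.append("\t".join(clean))
    return "\n".join(lines) + "\n"


table = to_tab_delimited([
    ("Open the\tvalve", "Open de klep"),
    ("Close the valve", "Sluit de klep"),
])
```

Write the result to a UTF-8 text file and open that file via Memory > Open memory...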

This is how tags are represented in CafeTran Espresso 2019's native project's bilingual documents:


image

Let me investigate whether something similar can be simulated for Studio projects.


>Let me investigate whether something similar can be simulated for Studio projects.


Well, of course you can always create the bilingual table in Studio, if you have access to that tool, and translate the table in CafeTran Espresso 2019.

Here you see an animation showing two TMs: one created from a native CafeTran Espresso 2019 project and one from a Studio project from the same text:


image


Just as expected: the TMs look the same, including how the location of the tags is stored.


This brings me to the following idea:


If tags in bilinguals for Studio projects are represented as either a pipe or one hidden project, this will have many advantages:


  • Better orientation during reviewing/translating the table in Word (reviewer sees where the tags are positioned).
  • Nice roundtrips possible, e.g. for the Poor man's regex tagger scenario.
Note that I don't request reimport of the tags in Studio projects (yet).

And then I remembered the existence of the Tortoise Tagger which can be used to tag the bilingual export documents: http://www.nemadeka.com/tagger.htm

>Tortoise Tagger


Of course the workflow has to be adapted to CafeTran Espresso 2019. For instance, the font attribute 'hidden' has to be used instead of a T4WIN style. I'll come back to this ...

Tortoise Tagger will probably be overkill for the relatively simple task of tagging bilingual export documents.
