
Untranslatables: How to handle them if they have special characters?

 Hello,

Quite some time has passed and I have become familiar with a lot of CT features and workflows. On my latest project I imported an SDLXLIFF file that contained a bilingual Excel file, and almost no tags had to be handled, even with coded expressions that are untranslatable.

The untranslatables that I usually deal with are not of the type 'Mr. Poirot' or 'Mrs. Marple', but rather ugly ones like these: {MQ}[1] or [Bullet]1, or sentence starts such as 8.4.3, etc.


Now, I know that you can add untranslatables with ALT-P and use F4 to show and insert them. The problem is that it doesn't work in my project: if I hit F4, I get a window with numbers only and no expressions are listed, although I have already saved entries such as simple multi-word terms like 'Adobe Reader 7.0'. So it seems there is a problem with the current 2017-Harbinger version regarding this. Or am I possibly handling it wrong?


I really hope that CT - if not already - will recognize in the future untranslatables that contain special characters like [ ] { } or bullets and such special text parts, because that's what happens a lot in my projects: bilingual exports from other CAT tools mostly contain such codes, which I cannot bypass or ignore. Simple untranslatables made of letters only can simply be saved in the glossary. But the glossary does not recognize special characters either: if I save [1](PPP)[2] in the glossary (because it often occurs in my text as a placeholder for a company name), then only PPP is inserted when I double-click on the glossary entry.


So, is there a solution for quickly inserting such untranslatables with 1-2 keystrokes and without selecting them in the source, which takes too long?

How do YOU handle untranslatables?


Thanks for any feedback!

Greetings,

Mike




Mike: I really hope that CT - if not already - will recognize in the future untranslatables that contain special characters like [ ] { } or bullets and such special text parts


It looks like some of these characters will need to be escaped. That would require regexes in a tab-delimited glossary. Alternatively, you can probably exclude them from "Do not Match" or "Trim..." in the Preferences for Memory and Glossary, respectively.
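To illustrate why escaping matters, here is a minimal sketch in Python (not CafeTran itself, whose regex flavour may differ in details). The strings are the examples from this thread; the helper name to_pattern is made up for the illustration. Unescaped, a string like [1](PPP)[2] is read as a character class, a group and another character class, so it no longer matches the literal text.

```python
import re

# Literal untranslatables taken from the examples in this thread.
literals = ["{MQ}[1]", "[Bullet]1", "[1](PPP)[2]"]

def to_pattern(literal: str) -> str:
    """Hypothetical helper: return a regex that matches the literal text exactly."""
    return re.escape(literal)

patterns = [to_pattern(s) for s in literals]

segment = "Open [1](PPP)[2] and press {MQ}[1] to continue."

# Escaped patterns find the literals that actually occur in the segment.
found = [s for s, p in zip(literals, patterns) if re.search(p, segment)]
print(found)

# The unescaped string is interpreted as regex syntax and misses the literal.
print(re.search("[1](PPP)[2]", segment))
```

The first print lists the two codes present in the segment; the second prints None, because the unescaped pattern would only match the text 1PPP2.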


Since I don't use tab-delimited glossaries, you'll probably have to wait for a better solution...


H. 

Thanks woorden,
I think I will just search and replace them in the original text with 'aa', 'bb', 'cc', etc. and change them back at the end. I have found out that you can save such strings in the glossary or as a fragment in a special memory; then, if you select them with the mouse, they are inserted in the target, but that still takes too long if you have many of them.
It would be nice if CT had an Autocorrect feature where I could define, for example,
aa = [bullet][1], bb = {MQ}[1], cc = long frequent term
and CT would replace these strings while typing. Maybe there is such a function?
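The idea of such a shortcut table can be sketched in Python as follows. This only illustrates the mapping described above and says nothing about how CT implements it; the names SHORTCUTS and expand are made up for the example. It replaces a shortcut only when it stands alone as a word, so the 'aa' inside a word like 'bazaar' is left untouched.

```python
import re

# Shortcut table from the post above (abbreviation -> full string).
SHORTCUTS = {
    "aa": "[bullet][1]",
    "bb": "{MQ}[1]",
    "cc": "long frequent term",
}

# Match any shortcut, but only as a whole word.
_shortcut_re = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in SHORTCUTS) + r")\b"
)

def expand(text: str) -> str:
    """Replace every stand-alone shortcut with its full string."""
    # A function is used as the replacement so that special characters
    # in the full strings are inserted literally.
    return _shortcut_re.sub(lambda m: SHORTCUTS[m.group(1)], text)

print(expand("aa First item, see bb in the bazaar"))
```

The same table, read in the other direction, would cover the search-and-replace round trip on the source text.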

Mike

 

Ah! I must try Resources > Text Shortcuts!

Mike

 

Mike: I really hope that CT - if not already - will recognize in the future untranslatables that contain special characters like [ ] { } or bullets and such special text parts


There are so many possibilities for such complex constructs that it is nearly impossible to hardcode them. The best way to tackle them is with regular expressions.


In CafeTran, precede the regular expression with the pipe | character. For example, you might add the following to the untranslatable list:


|\([A-Z]+\)


to indicate that you wish to catch any run of upper-case letters inside parentheses, e.g. (NASA).
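The expression can be tried out outside CafeTran, for instance in Python (a sketch only; the leading | is CafeTran's marker and is not part of the regex itself). The second pattern is an assumption of mine, written for codes like {MQ}[1] from the first post, and is not something taken from CafeTran.

```python
import re

# The expression from the post above, without CafeTran's leading pipe.
acronym = re.compile(r"\([A-Z]+\)")

segment = "The agency (NASA) and the union (EU) signed it (see below)."
acronyms = acronym.findall(segment)
print(acronyms)  # ['(NASA)', '(EU)']

# Hypothetical pattern for codes such as {MQ}[1]:
# upper-case letters in braces followed by digits in square brackets.
code = re.compile(r"\{[A-Z]+\}\[\d+\]")

codes = code.findall("Press {MQ}[1] then {AB}[12].")
print(codes)  # ['{MQ}[1]', '{AB}[12]']
```

Note that (see below) is not caught, because it contains lower-case letters and a space.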


Note that you can also create a glossary of untranslatables. Then you don't need to put the | character in front of the regular expression.


BTW, I recommend updating CafeTran to the latest 2017 Yeddi version.


Igor

>Note that you can also create a glossary of untranslatables.


Thanks for reminding me! I've been wanting to optimise my workflow with different lists of non-translatables per client.


Now I can do so. Via glossaries. Must come up with a naming convention for this. ntf_müller.txt or so.


Just great.

From the Dashboard I can select the client TM, the client glossary and the client's list of non-translatables when I create a new project. Very happy with that.


What's a non-translatable for one client is a word that should be translated for another client.

Thank you all for this in-depth advice. It's good to know that the untranslatable list as well as the glossaries accept regex; I'll take note of this. Meanwhile I have used the 'Text Shortcuts' and it also works like a charm, even with regex special characters, which I don't need to escape there.

Time again [and again] to consult my regex notes :)

Greetings,
Mike

 

Great that you found the optimal solution. I hope you are having fun with CT.
