REQ: Automatic setting of non-translatables for better match repair

When working with technical manuals, you often get pairs of very similar sentences like the following.


Case 1:

For this procedure you need a thermocouple (part no. 1234567) and the transmitter (part no. 1234568).

For this procedure you need a thermocouple (part no. 1235467) and the transmitter (part no. 1234568).


In these cases CT works fine and replaces the numbers in the translation of the second sentence.
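
To illustrate what I mean by match repair in Case 1, here is a rough Python sketch. It is only my guess at the general idea, not how CT actually implements it; the function name `repair_numbers` and the German target sentence are made up for the example.

```python
import re

NUMBER = re.compile(r"\d+")

def repair_numbers(tm_source, tm_target, new_source):
    """Return tm_target with its numbers swapped to mirror new_source.

    Returns None when the two source segments differ in more than
    their numbers, so no safe repair is possible.
    """
    # The text around the numbers must be identical in both sources.
    if NUMBER.sub("#", tm_source) != NUMBER.sub("#", new_source):
        return None

    old_nums = NUMBER.findall(tm_source)
    new_nums = NUMBER.findall(new_source)

    # Map each old number to its replacement; bail out on conflicts.
    mapping = {}
    for old, new in zip(old_nums, new_nums):
        if mapping.get(old, new) != new:
            return None
        mapping[old] = new

    # Replace every number in the target that has a known replacement.
    return NUMBER.sub(lambda m: mapping.get(m.group(), m.group()), tm_target)


tm_source = ("For this procedure you need a thermocouple (part no. 1234567) "
             "and the transmitter (part no. 1234568).")
tm_target = ("Sie brauchen ein Thermoelement (Teile-Nr. 1234567) "
             "und den Transmitter (Teile-Nr. 1234568).")
new_source = ("For this procedure you need a thermocouple (part no. 1235467) "
              "and the transmitter (part no. 1234568).")

repaired = repair_numbers(tm_source, tm_target, new_source)
print(repaired)
```

Because the numbers are mapped by value rather than by position, the repair still works if the target language reorders them.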


However, part numbers are often not pure numbers, but a combination of numbers and letters.


Case 2:

For this procedure you need a thermocouple (part no. 12H4567G09) and the transmitter (part no. 12J4568F23).

For this procedure you need a thermocouple (part no. 12K4547G19) and the transmitter (part no. 12J4TS8F13).


Here CT fails to do a match repair as above. If CT were able to identify any combination of letters and numbers as non-translatable, it could process Case 2 in the same way as Case 1 (currently it does not). Sometimes the segments are so short that they produce no TM hits at all (or hardly any), e.g.


Thermocouple (12H4567G09)

Thermocouple (12K4547G19)


This would be a big improvement for this kind of segment. And indeed, this kind of number-letter combination is hard to catch with RegEx.


This should only be done ad hoc, to avoid NT check errors with things like "220V", "220VAC" or "50Hz" (yeah, the world would be a better place if every language put blanks there).
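
To show both the idea and the problem, here is a small Python sketch of a pattern for "any token that mixes letters and digits". The pattern and the name `find_codes` are my own assumptions for illustration, not anything CT provides. It catches the part numbers from Case 2, but it also catches "220VAC" and "50Hz", which is exactly why I would not want such tokens stored as permanent non-translatables.

```python
import re

# A run of ASCII letters/digits that contains at least one digit
# AND at least one letter (so pure numbers and pure words are skipped).
CODE = re.compile(
    r"\b(?=[A-Za-z0-9]*\d)(?=[A-Za-z0-9]*[A-Za-z])[A-Za-z0-9]+\b"
)

def find_codes(segment):
    """Return all mixed letter-digit tokens found in the segment."""
    return CODE.findall(segment)


part_numbers = find_codes(
    "For this procedure you need a thermocouple (part no. 12H4567G09) "
    "and the transmitter (part no. 12J4568F23)."
)
units = find_codes("Connect to 220VAC at 50Hz.")
plain = find_codes("Thermocouple (1234567)")

print(part_numbers)
print(units)
print(plain)
```

The two lookaheads do the "at least one digit, at least one letter" check; everything else is a plain alphanumeric token between word boundaries.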