
REQ: In-project target term consistency

It would be nice to have two new features:

  • QA In-project target term consistency
  • Interactive In-project target term consistency check

Explanation:

During translation, segments that adhere to a certain pattern (e.g. no spaces or punctuation marks, possibly an initial uppercase letter; to be examined further) qualify as high-priority term pairs.
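A minimal sketch of what such harvesting could look like, in Python. The pattern (letters only, no spaces or punctuation) and the names `TERM_PATTERN` and `harvest_term_pairs` are assumptions for illustration only, not anything CafeTran Espresso actually provides:

```python
import re

# Hypothetical pattern: a single run of letters, so no spaces,
# digits or punctuation marks. To be refined per language pair.
TERM_PATTERN = re.compile(r"^[^\W\d_]+$")

def harvest_term_pairs(segments):
    """Collect term pairs from (number, source, target) segments.

    Returns {source_term: (target_term, segment_number)}; the first
    qualifying occurrence of a source term wins.
    """
    pairs = {}
    for number, source, target in segments:
        source, target = source.strip(), target.strip()
        if TERM_PATTERN.match(source) and TERM_PATTERN.match(target):
            pairs.setdefault(source, (target, number))
    return pairs

# Invented example segments: only segment 23 qualifies.
example = [
    (23, "Motorhalter", "motorhouder"),
    (24, "bloc moteur", "motorblok"),
    (25, "Den Motorhalter montieren.", "Monteer de motorhouder."),
]
harvested = harvest_term_pairs(example)
```

Here only the single-word segment 23 is harvested; the two-word segment and the full sentence are skipped.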

These automatically harvested term pairs are then used to check in-project target term consistency in the rest of the project, in two ways:
  1. During a QA after having finished the whole project (or at any chosen time in-between)
  2. While interactively translating, possibly with assistance of MT systems

A little more info about the interactive mode:

In segment 23 the German word 'Motorhalter' is translated as 'motorhouder' (motor support), since the translator can see that this segment belongs to the legend of an image in a manual. She can clearly determine the correct meaning of the source term and select her preferred target term here.

In segment 45 a high-fuzzy or exact match from the legacy TM is inserted, containing an incorrect target term (motorsteun). CafeTran Espresso 2019 flags this and displays a link to segment 23. The translator can either change the target term in segment 23 or adjust the target term in segment 45 (either manually or automatically).

In segment 347 a translation from an MT system is inserted that also contains a wrong target term (motorsupport). CafeTran Espresso 2019 flags this and displays a link to segment 23. The translator can either change the target term in segment 23 or adjust the target term in segment 347 (either manually or automatically).
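The flagging in segments 45 and 347 could be sketched roughly as follows, in Python. The function name `check_consistency`, the data layout and the example sentences are all invented for illustration; simple case-insensitive substring matching is assumed, which would also fire on longer compounds that contain the term:

```python
def check_consistency(term_pairs, segments):
    """Flag segments whose source contains a harvested source term
    but whose target lacks the preferred target term.

    term_pairs: {source_term: (target_term, origin_segment_number)}
    segments:   iterable of (number, source, target)
    Returns a list of (number, source_term, preferred_target, origin).
    """
    flags = []
    for number, source, target in segments:
        for src_term, (tgt_term, origin) in term_pairs.items():
            if number == origin:
                continue  # the segment that defined the pair
            in_source = src_term.lower() in source.lower()
            in_target = tgt_term.lower() in target.lower()
            if in_source and not in_target:
                flags.append((number, src_term, tgt_term, origin))
    return flags

# Term pair harvested from segment 23 (see the scenario above).
term_pairs = {"Motorhalter": ("motorhouder", 23)}

# Invented segments: 45 from a legacy TM, 347 from an MT system.
segments = [
    (23, "Motorhalter", "motorhouder"),
    (45, "Den Motorhalter montieren.", "Monteer de motorsteun."),
    (46, "Den Motorhalter prüfen.", "Controleer de motorhouder."),
    (347, "Motorhalter lösen.", "Motorsupport losmaken."),
]

for number, src, preferred, origin in check_consistency(term_pairs, segments):
    print(f"Segment {number}: expected '{preferred}' for '{src}' "
          f"(see segment {origin})")
```

With this data, segments 45 and 347 are flagged with a pointer back to segment 23, while segment 46 passes. The same routine could serve both the QA run and the interactive check, called on the whole project or on the current segment only.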

So, where should this flagging take place? As a pop-up when navigating to the next segment?


> segments that adhere to a certain pattern


In the end, this is a kind of live fragment consistency check. Sounds really interesting. But why exclude segments with spaces and punctuation? On the one hand, badly segmented files can contain segments that do not correspond:


These are

segments

that do not correspond to

each other. 


This happens not only with OCRed files, but is also likely in PPT. From this point of view it would be okay to exclude these kinds of segments. But what about French (and Spanish) Fries?


bloc moteur

mise en conformité:

cable del capó;


How can we then differentiate fragments from important terms? Mark these segments somehow as "emblematic" or "important"? Most often I create TB entries on the fly, so a term consistency check does the job.

>But why exclude segments with spaces and punctuation? 


Like I said, a pattern has to be defined, very likely, depending on both source and target language.


For some languages this will be possible, for others, alas, not.


For these, there will always be the manually driven checking of user-defined term pairs.

> Like I said, a pattern has to be defined, very likely, depending on both source and target language.


Indeed, but I assume Igor is, in general, very careful about introducing more user-defined patterns ("anything that can be set can break the whole program") and not very likely to do so. It might also open another Pandora's box, like the "Do not match" field. Perhaps a simple filter of the kind "up to x characters or y words" would be useful.
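As a rough idea of how simple such a filter could be, here is a sketch in Python. The name `is_term_candidate` and the default limits are made up for illustration; trailing punctuation is stripped first so that segments like the French and Spanish examples above can still qualify:

```python
import string

def is_term_candidate(segment, max_chars=30, max_words=3):
    """Return True if the segment is short enough to be a term.

    Either limit can be switched off by passing None.
    Leading/trailing whitespace and trailing ASCII punctuation
    (e.g. ':' or ';') are ignored when measuring.
    """
    text = segment.strip().rstrip(string.punctuation).strip()
    if not text:
        return False
    if max_chars is not None and len(text) > max_chars:
        return False
    if max_words is not None and len(text.split()) > max_words:
        return False
    return True

candidates = [
    s for s in [
        "bloc moteur",
        "mise en conformité:",
        "cable del capó;",
        "These are segments that do not correspond to each other.",
    ]
    if is_term_candidate(s)
]
```

With the default limits, the three short segments are kept and the full sentence is rejected. Two numeric settings like these are much harder to break than a free-form pattern field.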

> Indeed, but I assume Igor is, in general, very careful about introducing more user-defined patterns ("anything that can be set can break the whole program") and not very likely to do so. It might also open another Pandora's box, like the "Do not match" field. Perhaps a simple filter of the kind "up to x characters or y words" would be useful.


Like I said: to be examined ;). Hey, what are weekends for? It's cold outside, so a nice task inside, by the fireplace, is always interesting.
