
New QA consistency check

I'd like to request a new QA consistency check that would be especially useful for machine translation (MT) output.

Here are the screenshots. The description will follow later this morning.





One of the many problems with MT is that it does not keep translations consistent between segments. I assumed that the QA check for fragments (Memory) could help here, but apparently it doesn't (unless I made a mistake in my test).

I created a dedicated memory, making sure that only fragments were stored and the Fragments consistency check checkbox was selected:


No errors were reported:


So, now I'd like to propose a new QA check that reads all segment pairs whose source contains no more than one word into a temporary glossary and then uses this glossary to run the QA Terms consistency check (Glossary):


Examples of segments that contain no more than one word:



The reference reads:


So, shouldn't this QA have detected the inconsistencies?

BTW: My assumption for this QA check is that the translations of isolated words, as in legends, technical drawings, parts lists etc., are of ‘higher value’ than the translations of the same words in body text. As I said, this is an assumption. In any case, the procedure will allow you to spot inconsistencies. Finding and selecting the optimal translation for these words and inserting it are steps 2 and 3.
I now realise that by configuring the memory as fragments-only, those segment-covering words are excluded. So my proposal would indeed cover a new type of QA.
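To make the proposal concrete, here is a minimal Python sketch of the two steps, written outside CafeTran. It is only an illustration under assumptions: the function names, the naive substring matching and the English sample targets are all invented here and do not reflect how CafeTran implements its QA checks.

```python
import re
from collections import defaultdict

def build_temp_glossary(pairs):
    """Step 1: collect the targets of all segments whose source is a single word."""
    glossary = defaultdict(set)
    for source, target in pairs:
        words = source.split()
        if len(words) == 1:
            glossary[words[0]].add(target.strip())
    return dict(glossary)

def conflicting_terms(glossary):
    """Single-word sources that were translated in more than one way."""
    return {term: sorted(variants) for term, variants in glossary.items()
            if len(variants) > 1}

def check_segments(pairs, glossary):
    """Step 2: flag segments that contain a glossary term but none of its translations."""
    flagged = []
    for number, (source, target) in enumerate(pairs, start=1):
        for term, variants in glossary.items():
            in_source = re.search(rf"\b{re.escape(term)}\b", source)
            # Naive, case-insensitive substring test (an assumption of this sketch).
            in_target = any(v.lower() in target.lower() for v in variants)
            if in_source and not in_target:
                flagged.append((number, term))
    return flagged

# Hypothetical sample data; the English targets are invented for illustration.
pairs = [
    ("Gleisfahrzeug", "rail vehicle"),
    ("Gleisfahrzeug", "track vehicle"),
    ("Gerät", "device"),
    ("Gerät", "unit"),
    ("Das Gerät ist eingeschaltet.", "The appliance is switched on."),
    ("Einleitung", "Introduction"),
]

glossary = build_temp_glossary(pairs)
print(conflicting_terms(glossary))        # single-word terms with several translations
print(check_segments(pairs, glossary))    # segments that use none of the collected translations
```

With this sample data, the first report lists Gleisfahrzeug and Gerät as having two translations each, and the second flags segment 5, whose body text uses neither of the translations found in the one-word segments.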

Regarding QA > Consistency checks > Fragments consistency check (Memory), have you tried creating a new fragments-only memory?

I don't think setting an existing segments/fragments memory to fragments-only will work. It has to be a termbase (fragments-only) from the start.

If this does not work for you, I think there is an additional way to achieve what you seek: QA > Word lists.

Hi Jean,

Yes, I had created a new memory. Just to be sure, I repeated the test with a fresh project and created a fragments-only memory.

The QA check didn't catch the double translations for Gleisfahrzeug and Gerät.

In case you want to replicate the test, I've attached the CTP package.


After adding one of the two terms for "Gleisfahrzeug" as per the OP screenshot, I have reproduced the issue. No fragment/segment is flagged by this QA step, contrary to what I would expect.

I wonder why you add full segments to the fragments-only memory even though it is set as a "Fragments memory". CafeTran assumes that TM segments covering the checked fragments are valid and does not raise a flag.

A possible general implementation to cover such two-step QA checks might add a single option, "QA for filtered segments". The user would apply any filter they wish (e.g. regex-based) and then perform a QA task.
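As a rough illustration of what "QA for filtered segments" could mean, here is a small Python sketch: a regex filter selects the segments, and any check is then run on that subset only. Everything in it (the function names, the `^\S+$` filter for one-word sources, the sample data) is an assumption made for this example, not an existing CafeTran option.

```python
import re

def qa_on_filtered(pairs, source_filter, check):
    """Run `check` only on segment pairs whose source matches the regex filter."""
    pattern = re.compile(source_filter)
    return [
        (number, message)
        for number, (source, target) in enumerate(pairs, start=1)
        if pattern.search(source)
        for message in check(source, target)
    ]

def make_consistency_check():
    """Hypothetical check: report a source that was translated differently earlier."""
    seen = {}
    def check(source, target):
        first = seen.setdefault(source, target)
        if first != target:
            return [f"'{source}' translated as '{target}', earlier as '{first}'"]
        return []
    return check

# Hypothetical sample data; the English targets are invented for illustration.
sample = [
    ("Gleisfahrzeug", "rail vehicle"),
    ("Gleisfahrzeug", "track vehicle"),
    ("Gerät", "device"),
    ("Gerät", "unit"),
    ("Das Gerät ist eingeschaltet.", "The device is switched on."),
]

# Filter: sources that consist of a single word (no whitespace).
issues = qa_on_filtered(sample, r"^\S+$", make_consistency_check())
for number, message in issues:
    print(number, message)
```

The point of the design is that the filter and the check are independent: the same filter could be combined with any other QA task, and the same check with any other filter.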


While one should not add full phrases and sentences to fragments, a segment can consist of just one word and coincide with a fragment/term.

In other words, the translator may store a fragment, which happens to cover a full segment in a current or future project. This is something the user cannot control or know in advance, but they would still wish to check for consistency.

Unless a fragments memory is only for storing subsegments, not terms (that can be subsegments or segments, depending on the situation).

For example, if I add the word "Introduction" as a fragment, I will probably encounter it as a segment at some point, even if it was part of a phrase in the current translation.

Glossary building assignments are another example (segment=term).

CafeTran itself offers a "Word" segmentation rule, so this is not far-fetched, and a project segmented in this way would present the same situation.

For a TM storing segments, the default user action is to save the segment and go to the next one. For storing fragments, the addition of a fragment is an intentional, manual/separate action.

So, for a fragment/termbase, it would make sense for each intentionally entered fragment to be checked against the project segments/subsegments. I don't feel CafeTran should necessarily assume that segments can never coincide with a subsegment (fragment) entry.

If a general implementation can cover such a use case, then that's fine, although it would have been better if the "default" fragment consistency check didn't skip segments (which can be anything, even a single word) that match a stored fragment.


>So, for a fragment/termbase, it would make sense for each intentionally entered fragment to be checked against the project segments/subsegments. I don't feel CafeTran should necessarily assume that segments can never coincide with a subsegment (fragment) entry.

Nonetheless, if the TM you are QA-checking against contains an exact match for the project segment, CafeTran gives high priority to that exact match and does not flag any inconsistencies in that 100% matched segment. The match presumably comes from a quality TM. Otherwise, you would be distrusting the TM's quality and flagging lots of inconsistencies that the given TM has already "accepted".

I was talking about a fragments-only termbase (where no full segment is intentionally stored, even if a stored fragment may match a future full segment).

I would expect (or like, if you prefer) the content of all segments to be checked against my intentionally stored fragments (subsegments/terms that I have entered), even if a full segment matches a stored fragment.

If a segment pair contains a fragment that is not consistent with the entries in my fragments translation memory, I would like CafeTran to flag it, just like a glossary consistency check would. Wouldn't it?

In the CTE 10.7.2 update, exact matches from the translation memory selected for QA are also checked for included fragments. This may slow down the QA check a bit, though.
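For readers following the thread, the difference between the two behaviours can be pictured with a toy Python model. This is not CafeTran's code: the function, the `check_exact_matches` flag, the substring matching and the sample data are all assumptions made purely to illustrate the discussion above.

```python
def fragments_check(segments, memory, check_exact_matches):
    """Toy model: flag segments whose target lacks the stored translation of a fragment."""
    flagged = []
    for number, (source, target) in enumerate(segments, start=1):
        # Behaviour discussed earlier in the thread: a segment with an exact
        # match in the memory is trusted and skipped entirely.
        if source in memory and not check_exact_matches:
            continue
        for entry_source, entry_target in memory.items():
            if entry_source in source and entry_target not in target:
                flagged.append((number, entry_source))
    return flagged

# Hypothetical sample data; the English targets are invented for illustration.
memory = {"Gleisfahrzeug": "rail vehicle"}
segments = [
    ("Gleisfahrzeug", "rail vehicle"),
    ("Gleisfahrzeug", "track vehicle"),
    ("Das Gleisfahrzeug hält.", "The track vehicle stops."),
]

skipping = fragments_check(segments, memory, check_exact_matches=False)
checking = fragments_check(segments, memory, check_exact_matches=True)
print(skipping)
print(checking)
```

In this model, the one-word segment 2 goes unnoticed when exact matches are skipped and is only flagged once exact matches are checked as well, which is the situation described in the original post.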
