Sentence patterns coming into CafeTran

Hi,


The next update of CafeTran will see a new enhancement to auto-assembling called "Sentence patterns". It will allow translators to create translation memory segments with variables in the following form:


All the leaves are {1} and the sky is {2}. = Todas las hojas son color {1} y el cielo es {2}.


Then CafeTran will be able to use terms in glossaries or fragments in translation memories to replace the variables with the found entry, creating a complete translation.
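To make the idea concrete, here is a minimal sketch of how such a pattern could be filled. It is not CafeTran's code: the function names, the dict-based glossary and the regex approach are all assumptions made for illustration only.

```python
import re

# Hypothetical helper names; this only illustrates the idea of the feature.
VAR = re.compile(r"\{(\d+)\}")

def pattern_to_regex(source_pattern):
    """Build a regex in which each {n} becomes a named capture group vn."""
    parts = []
    pos = 0
    for m in VAR.finditer(source_pattern):
        parts.append(re.escape(source_pattern[pos:m.start()]))  # literal text
        parts.append(f"(?P<v{m.group(1)}>.+?)")                  # variable slot
        pos = m.end()
    parts.append(re.escape(source_pattern[pos:]))
    return re.compile("".join(parts))

def fill_pattern(source_pattern, target_pattern, segment, glossary):
    """Return the assembled translation, or None if the whole segment does not match."""
    m = pattern_to_regex(source_pattern).fullmatch(segment)
    if m is None:
        return None

    def replace(var):
        found = m.group("v" + var.group(1))
        # Assumption: a fragment with no glossary entry is left untranslated.
        return glossary.get(found, found)

    return VAR.sub(replace, target_pattern)

glossary = {"brown": "café", "gray": "gris"}
print(fill_pattern(
    "All the leaves are {1} and the sky is {2}.",
    "Todas las hojas son color {1} y el cielo es {2}.",
    "All the leaves are brown and the sky is gray.",
    glossary,
))
```

Because each variable is looked up by its number, the target side is free to place {2} before {1}.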


The feature will also let the user set the default exact match for the pattern, such as:


All the leaves are {1=brown} and the sky is {2=gray}. = Todas las hojas son color {1=café} y el cielo es {2=gris}.
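One way to read the {n=value} notation is that a single entry carries both the bare pattern and a ready-made exact match. The sketch below follows that reading. It is an assumption, not a description of how CafeTran stores or handles defaults, and the function names are made up.

```python
import re

# Matches {1=brown}, {2=gray}, ... (hypothetical parsing of the notation above).
DEFAULT_VAR = re.compile(r"\{(\d+)=([^{}]*)\}")

def expand_defaults(pattern):
    """Replace each {n=value} with value, giving the plain default sentence."""
    return DEFAULT_VAR.sub(lambda m: m.group(2), pattern)

def strip_defaults(pattern):
    """Replace each {n=value} with {n}, giving the bare pattern."""
    return DEFAULT_VAR.sub(lambda m: "{" + m.group(1) + "}", pattern)

source = "All the leaves are {1=brown} and the sky is {2=gray}."
print(expand_defaults(source))  # the sentence that would count as the exact match
print(strip_defaults(source))   # the pattern used when other values appear
```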


The order of variables is not fixed, so the feature may be really useful in auto-assembling when the translation of a sentence has the variable lexical elements in a completely different order.


I call this new improvement "Sentence patterns" as suggested by a user but it would be interesting to know an alternative (or perhaps a standard) term for it.


Igor



You have to mark it if you wish to specify precisely where in the target segment the variable part or fragment should go. As I said, patterns give the translator full control over auto-assembling: the translator steers the placement of the fragments in the target sentence.


Monday, Monday :)


Igor   

IK: Monday, Monday :)


It is. But it never rains.


But I still don't get it. How do you mark it? {1}? Where, in the TM? How do you do that?


H.

You will be able to mark it as shown in the announcement. The patterns can be set either in txt glossary files or TMX memories. I will provide the full description of the feature in the Solutions section along with the release of the update.


Igor

Hmmm.


H.

Hi Igor,

Actually, I don't know much about programming, so this suggestion from me might be a "mission impossible," which I've never seen fulfilled in any of the translation-related suites I've tried so far.

Thank you for attending to me!

Cheers,
Masato

Hi Masato,


You can create a pattern as long as you wish, with as many variables as you like, but variable numbers cannot be repeated in a pattern, as you described it. Also, CT treats the whole segment as a match to defined patterns.
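The rule about repeated variable numbers could be checked with something like the following sketch. The function names are hypothetical, and the extra check that both sides use the same set of numbers is my own assumption rather than something stated in this thread.

```python
import re
from collections import Counter

# Accepts both {1} and {1=default}; only the number is captured.
VAR = re.compile(r"\{(\d+)(?:=[^{}]*)?\}")

def repeated_variables(pattern_side):
    """Return the variable numbers that occur more than once on one side."""
    counts = Counter(VAR.findall(pattern_side))
    return sorted(n for n, c in counts.items() if c > 1)

def check_pattern(source, target):
    """Return a list of problems; an empty list means the pattern looks fine."""
    problems = []
    for side, text in (("source", source), ("target", target)):
        for n in repeated_variables(text):
            problems.append(f"{side}: variable {{{n}}} is repeated")
    # Assumption: source and target should use the same variable numbers.
    if set(VAR.findall(source)) != set(VAR.findall(target)):
        problems.append("source and target use different variable numbers")
    return problems

print(check_pattern("From {1} to {1}.", "De {1} a {2}."))
```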


CT uses the results of its own local MT engine processing (terms, fragments and subsegments) to auto-fill.


Igor

Hi Igor,

It means that results from Google and other online engines will not be taken in, unlike the current feature of improving AA with them?

Maybe I can find out more about this new feature when it is available.

Peace,
Masato

 

Sounds like a great feature - and it's also great that you can use it with both TMX files and TXT glossaries.

MBr: Sounds like a great feature


But, but, you don't even use AA, do you? Because you concentrate on legalese...


H.

> It means that results from Google and other online engines will not be taken in, unlike the current feature of improving AA with them?


No, they are not taken in yet. This will be the first release with the auto-fill concept. I may add external machine translation filling later.


Igor

With the new feature, AA might (!) produce many more usable results for me.
Besides, the new feature is much easier to use than regular expressions (which I haven't used for glossaries so far, only for segmentation rules).

What would also be great is if the new feature worked not only for AA but also for glossary matches.
For example, if the glossary contains the term "nachfolgend {1} genannt" and the text contains the words "nachfolgend KÄUFER genannt", will the glossary term be shown in the glossary tab?
(BTW, this particular entry with the sentence pattern would also produce a usable AA result for me in most cases).
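To illustrate what I mean by a glossary match, here is a rough sketch that searches for a pattern term anywhere inside a segment instead of matching the whole segment. Whether CT will do this is exactly my question, so everything here (function name, regex approach) is hypothetical.

```python
import re

VAR = re.compile(r"\{(\d+)\}")

def find_pattern_term(term_pattern, text):
    """Search text for a glossary term containing {n} slots.

    Returns (matched_text, [slot values]) or None if the term is not found.
    """
    parts = []
    pos = 0
    for m in VAR.finditer(term_pattern):
        parts.append(re.escape(term_pattern[pos:m.start()]))  # literal words
        parts.append(r"(.+?)")                                 # variable slot
        pos = m.end()
    parts.append(re.escape(term_pattern[pos:]))
    regex = re.compile(r"\b" + "".join(parts) + r"\b")
    m = regex.search(text)  # search, not fullmatch: the term is only part of the segment
    if m is None:
        return None
    return m.group(0), list(m.groups())

print(find_pattern_term(
    "nachfolgend {1} genannt",
    "Die Firma X, nachfolgend KÄUFER genannt, verpflichtet sich zur Zahlung.",
))
```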

 

Instead of the "100%" pattern match indication on the AA panel that looks somewhat colorless, and is sometimes confusing at least for me, how about simply displaying "Pattern Match"?

 

My suggestion for a name is "Puzzle match".

Good luck.


MBr: the new feature is much easier to use than regular expressions


Regexes are for the pros, but I suppose the new feature will present the same problem: Unexchangeability. I'm still cleaning old DV mdbs/tdbs for {n}. So I think you should only use them in a termbase or glossary (if you don't want fuzziness and other goodies) exclusively for Sentence Patterns.


With the new feature, AA might (!) produce many more usable results for me


If AA (either inserted in the target language pane or not) does not produce usable results, one of these is likely the cause:

  • Your source language isn't suitable, e.g. a highly agglutinative language (not your case, I'd say)
  • The words/phrases simply are not in your resources
  • You use the wrong settings. (An overview of CT resources)
  • The subject matter isn't suitable (highly creative texts - for which I still use CT and AA), not your case either. Legal texts are perfect for AA


In all other cases, you should benefit from AA, to the point where you don't have to look up a single word or phrase.


H.



One of my fantasies...


There is a "Refine AA" button in the segment toolbar.


Every time you move to the next segment, pattern matching is run on the whole segment, while at the same time (or in the background) the same is done for partial elements (such as On September {1}).


Then, you click the "Refine AA" button.


CT now auto-assembles only exact glossary entries and TM fragments gained from the default segment pattern matching into one complete whole.

