Sentence patterns coming into CafeTran

Hi,


The next update of CafeTran will see a new enhancement to auto-assembling called "Sentence patterns". It will allow translators to create translation memory segments with variables in the following form:


All the leaves are {1} and the sky is {2}. = Todas las hojas son color {1} y el cielo es {2}.


Then CafeTran will be able to use terms in glossaries or fragments in translation memories to replace the variables with the found entry, creating a complete translation.
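To make the idea concrete, here is a minimal sketch of how such a pattern could be filled. It is not CafeTran's actual implementation: the function names, the regex-based matching, and the rule that an unknown term is left untranslated are all assumptions made for illustration.

```python
import re

VAR = re.compile(r"\{(\d+)\}")  # matches {1}, {2}, ...

def compile_source(pattern):
    """Turn 'All the leaves are {1} ...' into a regex with one named group per variable."""
    parts, pos = [], 0
    for m in VAR.finditer(pattern):
        parts.append(re.escape(pattern[pos:m.start()]))  # literal text
        parts.append(f"(?P<v{m.group(1)}>.+?)")          # variable slot
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))

def translate(segment, src_pattern, tgt_pattern, glossary):
    """Fill the target pattern with glossary translations of the captured terms."""
    m = compile_source(src_pattern).fullmatch(segment)
    if m is None:
        return None  # the segment does not fit this pattern

    def fill(var):
        term = m.group(f"v{var.group(1)}")
        return glossary.get(term, term)  # assumption: unknown terms stay untranslated

    return VAR.sub(fill, tgt_pattern)

# Hypothetical glossary of colors
glossary = {"brown": "café", "gray": "gris"}
print(translate(
    "All the leaves are brown and the sky is gray.",
    "All the leaves are {1} and the sky is {2}.",
    "Todas las hojas son color {1} y el cielo es {2}.",
    glossary,
))
# prints: Todas las hojas son color café y el cielo es gris.
```

Because the target pattern refers to variables by number, the same function also handles a target in which {2} comes before {1}.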


The feature will also let the user set a default exact match for the pattern, such as:


All the leaves are {1=brown} and the sky is {2=gray}. = Todas las hojas son color {1=café} y el cielo es {2=gris}.
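One way to read the {n=default} notation is sketched below. The parsing rules and function names are assumptions, not CafeTran's internals: each pattern is rendered with its defaults, and a segment identical to the rendered source is treated as an exact match.

```python
import re

# Matches {1} as well as {1=brown}; group 2 is the default text, if any.
VAR = re.compile(r"\{(\d+)(?:=([^{}]*))?\}")

def defaults(pattern):
    """Map variable number -> default text, e.g. {'1': 'brown', '2': 'gray'}."""
    return {m.group(1): m.group(2)
            for m in VAR.finditer(pattern) if m.group(2) is not None}

def render_defaults(pattern):
    """Replace every {n=text} with its text; a bare {n} is left untouched."""
    return VAR.sub(lambda m: m.group(2) if m.group(2) is not None else m.group(0),
                   pattern)

def exact_match(segment, src_pattern, tgt_pattern):
    """Return the default translation if the segment equals the default source."""
    if segment == render_defaults(src_pattern):
        return render_defaults(tgt_pattern)
    return None

SRC = "All the leaves are {1=brown} and the sky is {2=gray}."
TGT = "Todas las hojas son color {1=café} y el cielo es {2=gris}."

print(exact_match("All the leaves are brown and the sky is gray.", SRC, TGT))
# prints: Todas las hojas son color café y el cielo es gris.
```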


The order of variables is not fixed, so the function may be really useful in auto-assembling when the translation of a sentence puts the variable lexical elements in a completely different order.


I call this new improvement "Sentence patterns", as suggested by a user, but it would be interesting to know an alternative (or perhaps a standard) term for it.


Igor



Will "multi-step (or reflexive) auto-fill and auto-assembling" be possible?

For example, when you have the following glossary entries, TUs or whatever, separately in your resources:

On September {1}
{1} released {2}
Apple
a device called {1}
iPad Pro

You have this source sentence:

On September 10, Apple released a device called iPad Pro.

Can this new feature generate the following result automatically?

On September {1=10}, {1=Apple} released {2=a device called {1=iPad Pro}}.
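The multi-step fill asked about here can be sketched as a recursive match. This is only an illustration under stated assumptions: the resource lists are the entries above, a variable may be filled by a plain term, a number, or another fully matched pattern, and the longest recognisable span wins. None of this is confirmed CafeTran behaviour.

```python
import re

VAR = re.compile(r"\{(\d+)\}")

# Hypothetical resources, copied from the entries listed above.
PATTERNS = ["On September {1}", "{1} released {2}", "a device called {1}"]
TERMS = {"Apple", "iPad Pro"}

def to_regex(pattern):
    """Literal text stays literal; each {n} becomes a named capture group."""
    parts, pos = [], 0
    for m in VAR.finditer(pattern):
        parts += [re.escape(pattern[pos:m.start()]), f"(?P<v{m.group(1)}>.+?)"]
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))

def annotate(text):
    """Return text with nested {n=...} marks if the resources cover it fully, else None."""
    if text in TERMS or text.isdigit():  # assumption: numbers are valid fillers
        return text
    for pattern in PATTERNS:
        m = to_regex(pattern).fullmatch(text)
        if m is None:
            continue
        # Recurse into every captured variable (group names are v1, v2, ...).
        filled = {name[1:]: annotate(value) for name, value in m.groupdict().items()}
        if all(value is not None for value in filled.values()):
            return VAR.sub(lambda v: "{%s=%s}" % (v.group(1), filled[v.group(1)]),
                           pattern)
    return None

def annotate_segment(segment):
    """Scan left to right, marking the longest recognisable span at each position."""
    out, i = [], 0
    while i < len(segment):
        for j in range(len(segment), i, -1):
            marked = annotate(segment[i:j])
            if marked is not None:
                out.append(marked)
                i = j
                break
        else:  # nothing recognised here: copy one character through
            out.append(segment[i])
            i += 1
    return "".join(out)

print(annotate_segment("On September 10, Apple released a device called iPad Pro."))
# prints: On September {1=10}, {1=Apple} released {2=a device called {1=iPad Pro}}.
```

The brute-force scan is fine for a single sentence but would need a smarter search for real segments and large resources.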


Peace,
Masato

Igor: What you describe is more like a "term patterns" feature than "segment patterns". This is an interesting extension to the current functionality. I am going to look into it.


I think that's even more dangerous than "segment patterns," comparable to regex matches.


Take for instance:


On September 17, 2015, Igor introduced Cloze Matching

In September 2015, Igor introduced Cloze Matching

September 2015: Igor introduced Cloze Matching

September 17, 2015: Igor introduced Cloze Matching


Op 17 september 2015 introduceerde Igor Cloze Matching

In september 2015 introduceerde Igor Cloze Matching

September 2015: Igor introduceert Cloze Matching

17 september 2015: Igor introduceert Cloze Matching



If you leave out the preposition, you're dead meat. "September {1}" won't do. Capitalisation won't do either. Not in all "cases" anyway.


The very last Remark/Opmerking in the Wiki also points to a problem. Well spotted! A 100% match can be wrong. The good thing is that CT doesn't "jump over" those matches if "Jump Over... | Exact Memory Matches" is enabled.


That said, Cloze Matching can be a huge time saver if applied correctly. It's probably subject-specific, or even document/job-specific.


His Nastee Olde Fartness,


H



One more potential problem for "term patterns": What happens if two (let alone more) of those patterns occur in the same segment? Not unlikely.


On September 17, 2015, Igor introduced Cloze Segment Matching, and on September 20, 2015, he added Cloze Term Matching


On September 17, 2015, Igor introduced Cloze Matching, and colourless green ideas slept furiously ever after


September {1}

{1} ideas
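A small experiment shows what is at stake with these two segments. The regular expressions are hand-written stand-ins for the two patterns, and how much text a variable may cover (a day number, one word, any run of words) is purely an assumption.

```python
import re

SEG1 = ("On September 17, 2015, Igor introduced Cloze Segment Matching, "
        "and on September 20, 2015, he added Cloze Term Matching")
SEG2 = ("On September 17, 2015, Igor introduced Cloze Matching, "
        "and colourless green ideas slept furiously ever after")

# "September {1}" with the variable restricted to a day number (assumption).
SEPTEMBER = r"September (\d{1,2})\b"
# "{1} ideas" with the variable restricted to one word (assumption) ...
IDEAS_ONE_WORD = r"(\w+) ideas"
# ... or allowed to cover any run of words (assumption).
IDEAS_ANY = r"((?:\w+ )*\w+) ideas"

def occurrences(regex, segment):
    """Return what the variable captures at each place the pattern occurs."""
    return [m.group(1) for m in re.finditer(regex, segment)]

print(occurrences(SEPTEMBER, SEG1))       # ['17', '20']
print(occurrences(IDEAS_ONE_WORD, SEG2))  # ['green']
print(occurrences(IDEAS_ANY, SEG2))       # ['and colourless green']
```

Two occurrences in one segment are easy to find. The harder question is where a variable ends: the one-word reading loses "colourless", and the any-words reading swallows "and".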


H.

How about 'placeholder framing'?

I'm dreaming.


What's the difference with subsegment matching?


H.

>What's the difference with subsegment matching?


I think: the fact that you can 'hard code' the content to be inserted.


@Igor: Sounds like a very nice feature!


Hans Lenting

Okay, so what's the difference with a termbase entry like:


today announced, that


(used in just about every press release)


H.

Auto-assembling with subsegment matching alone builds the target sentence in the order in which the subsegments are found in the source sentence. Sentence patterns give the translator finer control over auto-assembling, as she can define precisely the order and placement of her subsegments in the target segment.
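The contrast can be sketched with a toy example. The glossary, the pattern "The {1} {2}" = "El {2} {1}", and both functions are hypothetical and only illustrate the ordering difference, not how CafeTran assembles text.

```python
import re

GLOSSARY = {"The": "El", "red": "rojo", "car": "coche"}  # hypothetical entries

def assemble_in_source_order(segment):
    """Translate known words one by one, keeping the source word order."""
    return " ".join(GLOSSARY.get(word, word) for word in segment.split())

def assemble_with_pattern(segment, src_regex, tgt_template):
    """Capture the variables, translate them, and place them where the target says."""
    m = re.fullmatch(src_regex, segment)
    if m is None:
        return None
    translated = [GLOSSARY.get(term, term) for term in m.groups()]
    # {1} refers to the first captured variable, {2} to the second, and so on.
    return re.sub(r"\{(\d+)\}",
                  lambda v: translated[int(v.group(1)) - 1],
                  tgt_template)

print(assemble_in_source_order("The red car"))
# prints: El rojo coche
print(assemble_with_pattern("The red car", r"The (\w+) (\w+)", "El {2} {1}"))
# prints: El coche rojo
```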


Igor

OK, but in this example, what would be the difference between Sentence Patterns and termbase entries like:


all the leaves are

and the sky is


It looks like a novelty that would fit perfectly in my workflow, except that I don't understand it.


H.

Wait a minute, the translator (who is a "he" in my case, and politically incorrect at that) doesn't have to do anything to achieve the result. Is that the bonus?


H.

Having one sentence pattern, CafeTran will create the correct target translation no matter what stands for the variable part in the source sentence. So the colors of the leaves and the skies in our example can be changed in the source sentence, and yes, the translator does not have to do anything if he has a glossary or translation memory of colors.


Igor

But, but, do you have to mark the sentence pattern in any way? And if so, again, what's the difference with adding fragments to your termbase? If not, it's subsegment matching, the fuzzy way?


Mama Cass. Dead. Dammit. What a voice. Had to listen to a number of songs because of this.


H.

Candidate names, considering that English/Japanese translation packages call a similar feature "sentence pattern matching" or "hole filling feature":

1. Auto-fill
2. Pattern matching
3. Trans-modeling (meaning translation based on sentence models)
4. Missing-link finder
5. Hole filler


Peace,

Masato

