
Should I use the pipe here?

I know almost nothing about regular expressions, but I would like to make my glossaries as compact as I can with them. For example, how could I instruct CT to show "depending on" for any of the following source terms?


a seconda di

a seconda dei

a seconda degli


Thanks in advance


Yeah, while I was waiting for your answer I reconsidered everything and came to more or less the same conclusion, at least for now, since I have neither sufficient knowledge of regular expressions nor the time right now to experiment.

I think you are trying to achieve too many things in one entry with regular expressions. I would use regular expressions only on the source side of the entry, since the program matches them on the source side only. The target side should be a regular term. In this case, source and target side synonyms might work, such as:


pino giapponese;pini giapponesi  TAB momi fir;momi firs TAB コメツガ
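
To make the layout of that entry explicit, here is a minimal Python sketch that splits such a line into columns and synonyms. It only assumes what the example shows, namely tab-separated language columns with semicolon-separated synonyms; `parse_entry` is a hypothetical helper for illustration, not CafeTran's own parser.

```python
def parse_entry(line):
    """Split a glossary line into columns (TAB) and synonyms (semicolon)."""
    return [
        [term.strip() for term in column.split(";")]
        for column in line.split("\t")
    ]


# The entry from the post, with real tab characters between the columns.
entry = "pino giapponese;pini giapponesi\tmomi fir;momi firs\tコメツガ"
columns = parse_entry(entry)

for column in columns:
    print(column)
```

Each column keeps plain terms, so whatever is shown or inserted on the target side is ordinary text rather than a pattern.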



Igor: CafeTran shows the glossary term as it is entered in the glossary while you can see the caught term by the regular expression in the source language editor.


Please help me check whether I understand these things correctly. My idea would be to use regexes to automatically cater for both singulars and plurals. Then, if my trilingual glossary contains:


pino giapponese    momi fir    コメツガ


and I want to cater for plurals too like this:


pini giapponesi    momi firs    コメツガ,


I should edit the glossary like this:


|abet(e|i) giappones(e|i)   |momi fi(r|rs)   コメツガ


Indeed, when actually translating in the glossary pane the following shows up:


For momi fir: momi fir(r|rs)  and  |abet(e|i) giappones(e|i)

For momi firs: the same as above


Is this correct?


But when I start typing in the editor, CT suggests "abet(ei) giappones(ei)", and if I accept it by pressing Enter, I then have to edit it to make these two words singular or plural as necessary.
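
For anyone following along, a small Python sketch of what the source-side pattern does and does not do. It assumes the leading "|" only marks the entry as a regex and is not part of the pattern, and it uses Python's `re` module purely for illustration; CafeTran's own engine may differ in details.

```python
import re

# Source-side pattern from the glossary entry, without the leading "|".
source_pattern = re.compile(r"abet(e|i) giappones(e|i)")

phrases = ["abete giapponese", "abeti giapponesi", "abete giapponesi"]
matches = {phrase: bool(source_pattern.fullmatch(phrase)) for phrase in phrases}

for phrase, matched in matches.items():
    print(phrase, "->", matched)

# A regex can only test or find text; it cannot be "expanded" into one term.
# So a target cell such as this one stays a literal string to be edited by hand.
target_cell = "momi fi(r|rs)"
print(target_cell)
```

Note that the two groups are independent, so the mixed form "abete giapponesi" matches as well. That is harmless on the source side, but it shows why the pattern cannot decide which target form to insert.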


Is my understanding correct? Or, as Woorden seems to suggest, I should abandon this idea altogether and use fragments from memory instead? (Woorden: did I understand correctly what you mean?) 

Igor: blabla


You know what, Igor? Goodbye, and thank you for all the fish.


H.

> Igor, are you still perfectly sure that "the user" can "easily modify"?


Yes, that particular regular expression is easy to change (e.g. the start position and the number of characters, from 1 to 4).


> it's perfectly possible to use CafeTran without regexes, scripts, and Keyboard Maestro macros.


Of course, it is possible. Nevertheless, if users apply some external tools (outside CafeTran) to enhance their individual workflow, there is nothing wrong with that, as long as they know what they are doing and don't expect such workflow extensions to be supported, because support may have no knowledge of external macros, scripts etc. For example, there is not much I can do if a user decides to connect CafeTran to his espresso machine and make coffee every two hours with some kind of system macro or script. I saw people throwing axes at targets and they had so much fun with it.


> extremely harmful for CafeTran


I don't know about that. If I were a translator with no interest in any "weird" macro/script code and saw a macro or a script, I would definitely avoid it. So if a simpler and more intuitive solution exists, it should be provided in the first place. On the other hand, some power users who enjoy tinkering with their system tools might appreciate it - I just saw a script posted today by a CafeTran user on Linux.


IK: ...the user can easily modify


Mario: But, the glossary now shows "|a seconda d.{1,4}?" instead of "a seconda d" in the Italian column


Igor, are you still perfectly sure that "the user" can "easily modify"?


By the way, I'm going to post an article in the CafeTran section of ProZ claiming that it's perfectly possible to use CafeTran without regexes, scripts, and Keyboard Maestro macros. My impression is that the majority of the previous postings on that forum concern that subject (and in a very wrong way), and are extremely harmful for CafeTran. (The posting cannot use any politically incorrect items, nor F-words, so I'll have to rewrite it to make it lethal anyway).


H.

Thank you. I'll need to get used to that, but the most important thing is that regexes work (at my level, at least).

Yes, CafeTran shows the glossary term as it is entered in the glossary while you can see the caught term by the regular expression in the source language editor. 



For Igor: your "|a seconda d.{1,4}?" solution works well, thank you. But, the glossary now shows "|a seconda d.{1,4}?" instead of "a seconda d" in the Italian column. Should it be so? 



And what with all this talk about pipe dreams?


woorden: The basics are there, Kmitowski and Dimitriopolos took care of that.


Well, I guess that would be Igor and me. You should've seen my regular expression when that degree of TM fuzziness hit me full face. It was quite a spectacle.


IK: the user can easily modify


Thank you.


H.

> are you perfectly sure your regex won't catch any four letter words


It will catch at most four characters after "a seconda d". If the scope is too broad, the user can easily modify that regular expression based on the provided example.
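
To illustrate the scope question, here is a minimal Python sketch comparing the pattern above with a narrower one. Two assumptions: each pattern is tested against the whole candidate phrase (how CafeTran applies it to a segment may differ), and the stricter alternation `a seconda d(?:i|ei|egli)` is only a hypothetical variant, not something proposed in this thread.

```python
import re

# Pattern from the thread, without the leading "|" used in the glossary entry.
loose = re.compile(r"a seconda d.{1,4}?")
# Hypothetical narrower variant that lists the three wanted endings explicitly.
strict = re.compile(r"a seconda d(?:i|ei|egli)")

candidates = [
    "a seconda di",
    "a seconda dei",
    "a seconda degli",
    "a seconda della",  # a different form, not in the original list
    "a seconda dxyz",   # nonsense, but "d" plus 1 to 4 characters
]

results = {
    text: (bool(loose.fullmatch(text)), bool(strict.fullmatch(text)))
    for text in candidates
}

for text, (is_loose, is_strict) in results.items():
    print(f"{text!r}: loose={is_loose} strict={is_strict}")
```

The loose pattern accepts anything made of "a seconda d" plus one to four characters, while the alternation accepts only the three listed forms. Which of the two is preferable depends on whether extra forms such as "della" should also show "depending on".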



@woorden: Regarding how little you trust the use of regexes in term matching, have you ever stopped to consider that you are implicitly placing a lot of trust in the fuzzy matching algorithm used by CafeTran when matching terms in its TMXs? After all, what are regexes but yet another algorithm?


Fuzziness can sometimes mean errors (as you well know), yet you are constantly singing the praises of fuzziness, but only if used with TMXs.





Michael

IK: The following regular expression should catch all your examples


I have no doubt it will. I don't know any Italian, not after say 400 AD, but are you perfectly sure your regex won't catch any four letter words (and I'm fond of them), or shorter ones, as well?


H.

Michael: explain the basics of CafeTran terminology before you do so.


The basics are there, Kmitowski and Dimitriopolos took care of that. I wish I could write a "Why" rather than another "How to". And I'm serious, for once.

H.
