
Easiest way to convert CafeTran .txt glossary with synonyms into a TMX?

Does anyone here have any tips re easy ways of converting CafeTran TXT glossaries with synonyms into TMXs?


Michael



MB: They look around in the menus for a Converter, but cannot find one.


You can easily convert between the two kinds of termbases; it's just that the txt > tmx conversion can't handle source-side and target-side synonyms or regexes. And I don't think that's a problem you can solve easily, especially not the regexes.


H.

@woorden: Leaving aside regexes for the moment, I think creating a Converter that respects source and target-side synonyms would be trivial for Igor. I mean, if I can do it in five to ten minutes using Ron's Editor and a text editor, how hard could it really be?
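
To give an idea of what such a converter might involve, here is a minimal Python sketch. It is not how CafeTran does it; it assumes the glossary is tab-delimited with the source term in the first column and the target term in the second, and that synonyms on either side are separated by a semicolon (adjust `sep` if your glossary uses something else). The function names and the `creationtool` value are made up for illustration. Regexes and any extra columns are ignored.

```python
import xml.etree.ElementTree as ET
from itertools import product

# ElementTree's spelling of the reserved xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def split_terms(cell, sep):
    """Split one glossary cell into its non-empty, trimmed synonyms."""
    return [term.strip() for term in cell.split(sep) if term.strip()]


def glossary_to_tmx(text, src_lang, tgt_lang, sep=";"):
    """Return a TMX string with one <tu> per source/target synonym pair."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "glossary-to-tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "phrase",
        "o-tmf": "tab-delimited",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank lines and lines without a target column
        sources = split_terms(fields[0], sep)
        targets = split_terms(fields[1], sep)
        # Every source synonym is paired with every target synonym.
        for src, tgt in product(sources, targets):
            tu = ET.SubElement(body, "tu")
            for lang, term in ((src_lang, src), (tgt_lang, tgt)):
                tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
                ET.SubElement(tuv, "seg").text = term
    return ET.tostring(tmx, encoding="unicode")
```

Calling `glossary_to_tmx(open("glossary.txt", encoding="utf-8").read(), "en", "nl")` would return the TMX as a string, which you can then write to a file in UTF-8. A line with two source synonyms and three target synonyms yields six translation units.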

MB: Leaving aside regexes for the moment


I don't think you can do that, but anyway, if it's that simple, I think you can explain the process in the Wiki, or even write a script for it. I think I can write one for the Mac users (not for the regexes, though).


H.

Hi,

I'm writing an Excel macro (VBA) for users who are not familiar with scripting. It converts a glossary into one in which the synonyms in both the source and target fields are split into separate entries, while retaining (duplicating) all the subsequent fields (context, note, et cetera).
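
For readers who prefer a script to a spreadsheet, the same idea could be sketched in Python roughly as follows. This is not the Excel macro itself, only an illustration under assumptions: the glossary is tab-delimited, the first two fields are source and target, and synonyms are separated by a semicolon. The function names are hypothetical, and regex entries are not treated specially.

```python
from itertools import product

SYNONYM_SEP = ";"  # assumption: change this if your glossary uses another separator


def split_row(fields, sep=SYNONYM_SEP):
    """Expand one glossary row into one row per source/target synonym pair.

    All fields after the first two (context, note, ...) are copied unchanged
    into every resulting row.
    """
    if len(fields) < 2:
        return [fields]  # nothing to split; keep the row as it is
    sources = [s.strip() for s in fields[0].split(sep) if s.strip()]
    targets = [t.strip() for t in fields[1].split(sep) if t.strip()]
    return [[src, tgt, *fields[2:]] for src, tgt in product(sources, targets)]


def split_glossary(text, sep=SYNONYM_SEP):
    """Apply split_row to every non-blank line of a tab-delimited glossary."""
    out = []
    for line in text.splitlines():
        if not line.strip():
            continue
        for fields in split_row(line.split("\t"), sep):
            out.append("\t".join(fields))
    return "\n".join(out) + "\n"
```

Because every source synonym is combined with every target synonym in a single pass, there is no need to run it repeatedly until no separators are left.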

I hope I will succeed.

Cheers,
Masato

 



Thanks Masato!


One small question before you begin: will your solution in Excel respect the UTF-8 encoding of the CT resources? I know that Excel can be a bit suboptimal when it comes to UTF-8 (and end up trashing all manner of special characters), which is why I always use a CSV editor like Ron’s editor (which, however, of course doesn't have macro/VBA capabilities).


Michael



Hi, Michael

You are right. In my experience, you'd better convert your glossary to UTF-16 (or another format) first; otherwise, in UTF-8, some of the data may not be imported successfully. I don't know why.
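
One possible way to do that conversion outside Excel is a few lines of Python. This is only a sketch: the function name and file names are hypothetical, and it assumes the glossary really is UTF-8 (with or without a byte order mark).

```python
from pathlib import Path


def reencode(src_path, dst_path, src_enc="utf-8-sig", dst_enc="utf-16"):
    """Copy a text file, changing only its encoding.

    Working on bytes keeps tabs and line endings exactly as they were.
    "utf-8-sig" drops a UTF-8 byte order mark if one is present, and
    "utf-16" writes a byte order mark at the start of the output.
    """
    text = Path(src_path).read_bytes().decode(src_enc)
    Path(dst_path).write_bytes(text.encode(dst_enc))
```

For example, `reencode("glossary.txt", "glossary_utf16.txt")` before opening the file in Excel, and `reencode("glossary_utf16.txt", "glossary.txt", "utf-16", "utf-8")` to go back afterwards.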

Peace,
Masato

MB: I know that Excel can be a bit suboptimal


That's why I switched to LibreOffice Calc recently (and I think I mentioned it here). Calc seems to have it all: it handles UTF-8, can save as tab-delimited text (Numbers can't), and accepts macros. Do check the latter, as I don't use macros.


H.

>Do check the latter, as I don't use macros.


You can do it with a formula too, but you have to drag it down all the way.

UPDATE:


  • I tried the Perl script. I don't understand Perl, but I should be able to run it anyway. It won't run.
  • The awk script does run, and it runs successfully. But I think you'll have to run it several times (and when do you stop?).
  • A regex looks OK, but again, I think you'll have to run it several times.


In other words, it's not particularly simple, and I've only been dealing with source side synonyms. An MB glossary (which, for obvious reasons, I had to create - and I called it Michael Syns) can also contain target side synonyms, regular expressions, and sentence patterns. You should be able to delete the latter two quite easily, but I'm afraid there's little else you can do. Aspirin.

H.

Wow, thanks for trying Hans! I'm so sorry you had to create a Beijer Glossary. It must have been painful ;)

Hi All,

I've successfully made an Excel macro for splitting synonyms, retaining all subsequent fields.

Now I'm writing a supplemental macro for statistical analysis, so I hope that I will be able to offer the package to you in a week or so.

Have a nice day!
Masato

 

Hi Masato,


Cool! However, you are aware that Igor added something similar to the latest version, right? Your supplemental macro for statistical analysis sounds very interesting though!


Incidentally, your macro reminded me of an idea I had a while back, which others here have also had I'm sure: it would be cool (sorry, I know I probably shouldn’t say "cool" twice in one post) if CafeTran had some kind of plug-in system that would allow third-party developers to contribute macros etc. to CafeTran. Basically kind of like the SDL exchange thing. This would allow people who hate to bother Igor for new features (not me ;)) to suggest or create their own functionality, which could then be shared with all of us. Just an idea.


@Igor: any news re the updated version of this feature that properly handles all synonyms?


Michael



MB: This would allow people who hate to bother Igor for new features (not me ;)) to suggest or create their own functionality,

 

Good idea. Should be a new/separate section of Freshdesk, though. I'll contribute, for example, with my (Mac only) solution for dropping files on the Dashboard. If you run CafeTran, the folder that contains the current job opens, and stays frontmost, solving the problem of resizing the CT window or opening a folder "manually."

 

 

 

Will record a new screencast, of course.

 

H.

 

Hello, everybody

If you are interested in my macro, please see the Tips and Tricks section.

M,

 
