
Excel macro for de-merging your glossary

Glad to offer you an Excel macro that de-merges the source-side synonyms.


CafeTran already has a similar feature, though. The one difference, an additional feature that I've long been waiting for, is the ability to de-merge both sides while retaining (duplicating) all the subsequent fields (context, note1, note2, etc.).


This macro can do this.
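For readers who want to see the idea outside Excel, here is a minimal Python sketch of de-merging both sides. It is not the macro's actual VBA code. It assumes each glossary row is a list of fields in the order source, target, then any further fields, with synonyms separated by a semicolon; the function names `demerge_row` and `demerge` are made up for this illustration.

```python
from itertools import product


def demerge_row(row, sep=";"):
    """Expand one row into one row per (source synonym, target synonym) pair.

    Assumed layout (hypothetical): row[0] = source, row[1] = target,
    row[2:] = further fields such as context, note1, note2.
    """
    source, target, *rest = row
    # Split on the separator and drop empty pieces; keep the original
    # field as-is if nothing is left, so a row is never lost.
    sources = [s.strip() for s in source.split(sep) if s.strip()] or [source]
    targets = [t.strip() for t in target.split(sep) if t.strip()] or [target]
    # Every combination gets its own row, with the trailing fields duplicated.
    return [[s, t, *rest] for s, t in product(sources, targets)]


def demerge(rows, sep=";"):
    """De-merge every row of a glossary given as a list of field lists."""
    result = []
    for row in rows:
        result.extend(demerge_row(row, sep))
    return result


if __name__ == "__main__":
    glossary = [["FOB;free on board", "franco a bordo;FOB", "shipping", "Incoterms"]]
    for fields in demerge(glossary):
        print("\t".join(fields))
```

With the sample row, two source synonyms and two target synonyms give four rows, each carrying the same context and note fields.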


For further information, please see Readme.pdf.


It may take some time for this macro to finish: please wait over a cup of coffee.

Cheers,
Masato

Great initiative, Masato! Highly appreciated.


The maximum number of rows in a spreadsheet is 1,048,576 for Excel 2007, 2010, and 2013. Before demerging, make sure that the number of resulting rows will not exceed the cap.
For example, if your glossary has 100,000 entries (rows) and, on average, three synonyms on each side, then after both sides are demerged, it will have 900,000 rows (100,000 * 3 * 3).
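The check described above can be done before opening Excel. The following is a rough Python sketch, assuming the same row layout as the glossary (source first, target second, synonyms separated by a semicolon); `synonym_count`, `estimated_rows`, and `fits_in_excel` are hypothetical helper names. It counts the exact number of resulting rows per entry instead of relying on an average.

```python
EXCEL_MAX_ROWS = 1_048_576  # row cap quoted in the post above


def synonym_count(field, sep=";"):
    """Number of non-empty synonyms in a field; an empty field counts as one."""
    return max(1, sum(1 for part in field.split(sep) if part.strip()))


def estimated_rows(rows, sep=";"):
    """Row count after de-merging both the source and the target column."""
    return sum(synonym_count(r[0], sep) * synonym_count(r[1], sep) for r in rows)


def fits_in_excel(row_count, header_rows=0):
    """True if the de-merged glossary (plus optional header rows) stays under the cap."""
    return row_count + header_rows <= EXCEL_MAX_ROWS


# Worked example from the post: 100,000 entries, three synonyms on each side.
print(100_000 * 3 * 3)                 # 900000
print(fits_in_excel(100_000 * 3 * 3))  # True
```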


Impressive :) 


This macro will be useful if:

1. You want to use a CT glossary (with notes) for a CAT tool that does not detect synonyms separated by a semicolon (such as SDL Trados, OmegaT, and Wordfast).

2. You want to reverse a glossary (swap the source and the target) for reorganization by target entries.

3. You want to cut out some entries (synonyms) and compile them into a separate glossary or TMX (such as regex and segment patterns).

4. You want to delete duplicate entries that are not detected by CT (such as "FOB;free on board" and "free on board;FOB").

5. You are asked by your agency to deliver a project-specific glossary as well, but the agency does not use the CT-type glossary (synonyms delimited by a semicolon etc.).
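To illustrate use case 4, here is a small Python sketch of one way to spot duplicates that differ only in synonym order. This is a different technique from the macro (which de-merges rows in Excel): it compares the set of synonyms on each side, so "FOB;free on board" and "free on board;FOB" get the same key. The names `synonym_key` and `drop_order_duplicates` are hypothetical, and the comparison is case-sensitive by assumption.

```python
def synonym_key(field, sep=";"):
    """Order-insensitive key: the set of non-empty synonyms in a field."""
    return frozenset(part.strip() for part in field.split(sep) if part.strip())


def drop_order_duplicates(rows, sep=";"):
    """Keep the first row for each (source synonyms, target synonyms) pair.

    Assumed layout (hypothetical): row[0] = source, row[1] = target.
    Any further fields are ignored when comparing rows.
    """
    seen = set()
    kept = []
    for row in rows:
        key = (synonym_key(row[0], sep), synonym_key(row[1], sep))
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


if __name__ == "__main__":
    glossary = [
        ["FOB;free on board", "franco a bordo"],
        ["free on board;FOB", "franco a bordo"],
        ["CIF", "costo, seguro y flete"],
    ]
    for fields in drop_order_duplicates(glossary):
        print("\t".join(fields))
```

Only the first of the two FOB rows survives; notes on the dropped row are lost, so it may be worth reviewing the duplicates before deleting them.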

Cheers,
Masato

Hey Masato!


>1. You want to use a CT glossary (with notes) for a CAT tool that does not detect synonyms separated by a semicolon (such as SDL Trados, OmegaT, and Wordfast).


Why would you. Ever. Use these other ones, I mean.

>5. You are asked by your agency to deliver a project-specific glossary as well, but the agency does not use the CT-type glossary (synonyms delimited by a semicolon etc.).


They should start using CafeTran ;). On the other hand: perhaps better not, because they will start asking for a server-based version with lots of PM features rapidly ;).

Hi Hans,

You are right! I use CT only. I just wanted to give a list of all possible use cases.

Thanks

 


It's copulating up a perfectly standardised and exchangeable format, like SDL does with XLIFF.


I hope this is insulting enough. If not, let me know, and I'll give it another try.


H.



