Start a new topic

Converting an Excel termbase with synonyms

I need to convert my 3-language CTE termbase to two simple, bilingual and tabbed Excel files.


But, many terms include semicolon-separated synonyms. Does Excel offer a way to automatically split the rows containing synonyms into separate rows for each item?


Not that I’m aware of. You can do this manually in Excel or use a script for a text file.
On second thought: the manual steps could be automate in a vba macro. Complex to develop. Not interesting for a one-time conversion task.

I've tried a lot with Excel, but I couldn't find a way accessible to me. Silly me that I built my 3-language glossary that way! No problem, I'll do it manually and probably the new glossaries will be ready in one or two months time.

Did you use semicolons in the source terms too?
You can ignore that question

Ignored. However, yes, I used semicolons in the source terms too, and that was my biggest mistake.

In discovered that too.

image


Copy a column to an editor. Replace ';' with tab. Paste the result into a column of a new spreadsheet:


image


Sort on C, B (in case of maximal 2 synonyms, adapt if necessary).

Take out and store the rows that don't have synonyms in any language (only 3 columns).

Then you'll have to copy and delete columns a lot, to get:


appel Apfel

peer Birne

appel apple

peer pear

etc.


In case you want to use javascript:


 

(() => {
    "use strict";

    // termPairs :: String -> String
    const termPairs = s =>
        lines(s).flatMap(x => {
            const [l, r] = x.split("\t");

            return l.split(";").map(
                k => `${k}\t${r}`
            );
        })
        .join("\n");


    // ---------------------- TEST -----------------------
    const main = () =>
        termPairs(
            Application("Keyboard Maestro Engine")
            .getvariable("termSample")
        );

    // --------------------- GENERIC ---------------------

    // lines :: String -> [String]
    const lines = s =>
        // A list of strings derived from a single
        // string delimited by newline and or CR.
        0 < s.length ? (
            s.split(/[\r\n]+/u)
        ) : [];


    return main();
})();

 And for the target-side:

 

(() => {
    "use strict";

    const main = () => {
        const
            kme = Application("Keyboard Maestro Engine"),
            source = kme.getvariable("termSample");

        return lines(source).flatMap(x => {
            const [l, r] = x.split(/\t/u);

            return r.split(/;/u)
                .reverse()
                .map(
                    k => `${l}\t${k}`
                );
        })
        .join("\n");
    };

    // --------------------- GENERIC ---------------------

    // lines :: String -> [String]
    const lines = s =>
        // A list of strings derived from a single
        // string delimited by newline and or CR.
        0 < s.length ? (
            s.split(/[\r\n]+/u)
        ) : [];

    return main();
})();

 

Do you wanna quit 3-colums glossary completely? I don't see any problems with these, as long as you build them concept-based.


Stuhl - chair - stoel

Stuhl - ontlasting - feces

Wow, many thanks! I'll try tomorrow (not the Javascript solution, sorry).


No, I'm not quitting the 3-language glossary, although I long discovered that it's not as useful as I thought. The fact is, my other CAT tool doesn't accept 3-language glossaries.

Good luck!


You'll probably also want to save the two columns target_language_1 and target_language_2.

Mario: I need to convert my 3-language CTE termbase to two simple, bilingual and tabbed Excel files.


I think Ron's Editor can help with that.


Open the CSV or TSV in the program.


Select the first language column and apply this Row > Split action (see screenshot).


Split, then select the second language column and do the same. Rinse and repeat for the third column.



1 person likes this

Beautiful! Many thanks Jean. I've only used RonsEditor for the basics so far, but this is going to save me a lot of time.


Cheers


Mario

Will this work when cells don’t have an equal number of synonyms? E.g.: Japanese: 2, Italian: 3, English: 1.
Login to post a comment