Memory Maintenance

I just realised that the solution for checking translation consistency can easily be adapted for Memory Maintenance: compacting large TMs by removing all punctuation and spaces, resetting all numbers to 0, and removing duplicates. That is, if you can live with CafeTran adjusting the numbers upon insertion in the target segment and are willing to add the punctuation marks yourself.

This will be my next project.

This would be a nice way to remove duplicates from TMs, possibly (hopefully) resulting in even better (more consistent) translations.

Attached is an AppleScript that cleans up TMX files by removing segments that are identical apart from punctuation, spaces and numbers (while keeping one unique segment pair for every set of matches). Please note that punctuation is removed from the resulting TMX file. All numbers are set to 0 (normally your CAT tool will insert the correct numbers when inserting fuzzy matches). Segment pairs where source=target and segment pairs that don't contain any letters are removed too.
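For readers who cannot run the attachment, here is a minimal sketch of the rules described above, written in Python rather than AppleScript. It is not the attached script. The function names (clean, match_key, dedupe) are made up for illustration, and several details are assumptions: punctuation means Unicode punctuation characters only (currency and math symbols are left alone), punctuation is stripped before digits are zeroed (so "1,000" becomes a single 0), the source=target test is applied after cleaning, and a pair is dropped for "no letters" only when neither side contains a letter.

```python
import re
import unicodedata


def clean(text):
    """Return the text as it would be stored: no punctuation, numbers set to 0."""
    # Assumption: "punctuation" = characters in the Unicode P* categories.
    no_punct = "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )
    # Each run of digits becomes a single 0.
    zeroed = re.sub(r"\d+", "0", no_punct)
    # Collapse runs of whitespace so stored segments stay readable.
    return " ".join(zeroed.split())


def match_key(text):
    """Comparison key: the cleaned text with all spaces removed as well."""
    return clean(text).replace(" ", "")


def dedupe(pairs):
    """Keep one cleaned (source, target) pair for every set of matches."""
    seen = set()
    kept = []
    for source, target in pairs:
        key = (match_key(source), match_key(target))
        if key[0] == key[1]:
            continue  # source=target (compared after cleaning, an assumption)
        if not any(ch.isalpha() for ch in key[0] + key[1]):
            continue  # the pair contains no letters at all
        if key in seen:
            continue  # differs from a kept pair only in punctuation, spaces, numbers
        seen.add(key)
        kept.append((clean(source), clean(target)))
    return kept
```

Note that the comparison is case-sensitive in this sketch, since nothing above says that case should be ignored.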

Further removals can be added on request.


The script shows how to parse and process a TMX file and later rebuild it, using AppleScript.
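The same parse, process and rebuild cycle can be sketched in Python with the standard library. This is only an illustration, not a port of the AppleScript. The names read_pairs and filter_tmx are hypothetical, and the sketch assumes a plain bilingual TMX where each tu holds the source tuv first and the target tuv second. Inline tags inside seg are flattened to their text, and the XML declaration and DOCTYPE are not written back.

```python
import xml.etree.ElementTree as ET


def read_pairs(body):
    """Yield (tu, source_text, target_text) for every translation unit."""
    for tu in body.iter("tu"):
        # itertext() flattens inline markup inside <seg> to plain text.
        segs = ["".join(seg.itertext()) for seg in tu.iter("seg")]
        if len(segs) >= 2:
            # Assumption: first <seg> is the source, second is the target.
            yield tu, segs[0], segs[1]


def filter_tmx(tmx_text, keep):
    """Parse a TMX string, drop units where keep(source, target) is false,
    and return the rebuilt TMX as a string."""
    root = ET.fromstring(tmx_text)
    body = root.find("body")
    # Materialise the list first so removal does not disturb the iteration.
    for tu, source, target in list(read_pairs(body)):
        if not keep(source, target):
            body.remove(tu)
    return ET.tostring(root, encoding="unicode")
```

For example, filter_tmx(text, lambda s, t: s != t) would drop every unit whose source and target are identical.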

Hi Hans,

As scripting is not a CT feature, could topics that involve AppleScript or other external scripting be created in the General Technical Topics section in future? I would rather this section focused on pure CafeTran functionality.


