Hi guys and gals,
Is there any way to remove duplicates from Total Recall databases or tables?
>Is there any way to remove duplicates from Total Recall databases or tables?
Only: Export everything. Load as a memory, perform Tasks to remove duplicates. But hey, does it really matter, considering the price of disk space in military-grade laptop tanks?
>I could also open the .db in TMLookup, and use TMLookup's duplicate removal tool on it.
I kind of remember that Igor said that the removal of duplicates wasn't so simple. So, perhaps Farkas has found the hen with the golden eggs? Something that he'd be willing to share? Give some inspirational hints? I know that you have the best relations with him, hint, hint.
Hello Michael, I think the safest bet would be to run maintenance tasks on the recalled TMX (or simply create a new TR table).
Keeping TMs well organized helps you Join them ad hoc if you need it or rebuild your TR tables anytime with no fuss.
I guess you keep separate TRs for your own TMs and EU DGT, Opus stuff, etc. Or do you use one big fat db to rule them all?
idim: I think the safest bet would be to run maintenance tasks on the recalled TMX
How very true. Not only the safest, but also the fastest solution. In fact, I suggested it when a "Japanese user" ended up with more TR TM segments than the number of segments in the table. Not sure if a TR TM is useful for Japanese as a ST, though.
Using TMLookup might be a problem. I never used TML, but it seems it creates a .db for each and every table. Don't know what happens if you open a CT DB with multiple tables in TML. You can of course open the table in a SQLite browser, and execute an SQL command. For more information, see Lenting. He knows everything.
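For anyone who wants to try the SQL route, here is a minimal sketch using Python's built-in sqlite3 module. The table name and the column names `source` and `target` are assumptions for illustration only; I don't know the actual layout of a Total Recall table, so check it in a SQLite browser first and work on a backup copy of the .db.

```python
import sqlite3

def remove_duplicates(db_path, table, source_col="source", target_col="target"):
    """Delete rows that repeat an earlier (source, target) pair.

    The column names are hypothetical; adjust them to the real table layout.
    Returns the number of rows removed.
    """
    con = sqlite3.connect(db_path)
    try:
        with con:  # commits on success, rolls back on error
            cur = con.execute(
                f'DELETE FROM "{table}" WHERE rowid NOT IN ('
                f'SELECT MIN(rowid) FROM "{table}" '
                f'GROUP BY "{source_col}", "{target_col}")'
            )
            return cur.rowcount
    finally:
        con.close()
```

The statement keeps the row with the lowest rowid in each group of identical source/target pairs and deletes the rest. On a very large table the GROUP BY can be slow unless those columns are indexed.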
idim: Keeping TMs well organized helps you Join them ad hoc if you need it or rebuild your TR tables anytime with no fuss.
If you use TR as His Igorness intends it to be used (I don't), that is, adding TMs all the time, after each and every job, duplicates will be inevitable. And that's no problem, in view of the solution you yourself provided above: It's the TR TM that counts.
It would take ages for very large SQL databases to analyze their rows for duplicates. However, there is no harm in keeping such duplicates in terms of performance, because they are eliminated by default during the recall to the working translation memory.
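To illustrate why duplicates in the table need not end up in the working memory, here is a rough sketch of filtering them out while the rows are read. This is not CafeTran's actual code, just the general idea; the function name and the (source, target) row shape are made up for the example.

```python
def recall_unique(rows):
    """Yield each (source, target) pair once, in first-seen order."""
    seen = set()
    for source, target in rows:
        key = (source, target)
        if key not in seen:
            seen.add(key)
            yield key

# Hypothetical rows as they might come back from the table.
rows = [("Hello", "Hallo"), ("Hello", "Hallo"), ("Thanks", "Danke")]
working_tm = list(recall_unique(rows))
```

Each row is checked once against a set, so the cost is paid during the recall itself rather than in a separate pass over the whole database.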
IK: ...because they are eliminated by default during the recall to the working translation memory.
That didn't seem to be the case with the Japanese user I mentioned above, who rather recently ended up with more segments in the TR TM than in the original table. Please explain.
The Total Recall progress bar (which shows the numbers in the Windows look and feel) was misleading a few updates ago. It showed the number of segments TR had analyzed, not the number of segments actually loaded. Or the user may have changed the TR option to keep all the duplicates.