There might be situations where double entries (without differences in context, field or notes) appear in the glossary:
The wishes (alternatively, and perhaps as an option, so as not to overload slower machines):
> double entries (without differences in context, field or notes) in the glossary might appear.
> There might be situations where double entries (without differences in context, field or notes) appear in the glossary
They don't appear by themselves. Please avoid adding double entries, thus creating the need for more complexity. Keep your terminology clean and simple, and the CafeTran interface will stay clean and simple. Some general task to remove double entries might be added in the future.
> The problem
It's not a problem. It is a feature.
IK: ...Some general task to remove double entries might be added in the future.
Try the following:
1. Activate the Glossary interface via View > Show glossary menu.
2. Choose Glossary > Remove duplicate entries.
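Conceptually, the menu task above boils down to keeping the first occurrence of each identical line in the tab-delimited glossary file. A minimal sketch of that idea (the function name and sample entries are my own, not CafeTran's):

```python
# Remove exact duplicate lines from a tab-delimited glossary,
# keeping the first occurrence and preserving the original order.
def remove_duplicate_entries(lines):
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)
    return unique

glossary = [
    "dog\tHund",
    "cat\tKatze",
    "dog\tHund",            # exact duplicate: removed
    "dog\tHund\tzoology",   # differs in the field column: kept
]
print(remove_duplicate_entries(glossary))
```

Note that entries differing only in context, field or notes are not exact duplicates and survive such a pass, which is exactly the limitation raised below.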
IK: Try the following
That won't work - I guess - in the case of e.g.
Torsten: Better would be to filter them out
Better would be to convert them to TMX, run the readily available applicable tasks, and leave them as TMX.
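For reference, wrapping glossary pairs in TMX is mechanical: each source/target pair becomes a `<tu>` with two `<tuv>` elements. A sketch under assumptions (the language codes, helper name and header attributes are placeholders; a real conversion should use CafeTran's own export):

```python
# Sketch: wrap source/target glossary pairs in a minimal TMX 1.4 document.
# Language codes "en" and "nl" are placeholders, not CafeTran defaults.
from xml.sax.saxutils import escape

def pairs_to_tmx(pairs, src_lang="en", tgt_lang="nl"):
    units = []
    for src, tgt in pairs:
        units.append(
            f'<tu><tuv xml:lang="{src_lang}"><seg>{escape(src)}</seg></tuv>'
            f'<tuv xml:lang="{tgt_lang}"><seg>{escape(tgt)}</seg></tuv></tu>'
        )
    header = (f'<header creationtool="sketch" creationtoolversion="0" '
              f'segtype="phrase" o-tmf="txt" adminlang="en" '
              f'srclang="{src_lang}" datatype="plaintext"/>')
    body = "\n".join(units)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<tmx version="1.4">{header}<body>\n{body}\n</body></tmx>')

print(pairs_to_tmx([("dog", "hond")]))
```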
> And it won't work for
Of course, it won't. They are not duplicates.
> reasonable response, eg. the number of deleted entries
What's the reason for checking the number of deleted entries, as they are gone anyway? Just a check for check's sake. :)
I think CTE creates a txt backup when you clean glossary duplicates, or alter the sort order.
Comparing the two text files with a visual diff tool (such as Meld) should be enough for the curious.
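Since that backup exists, the "number of deleted entries" asked for above can in fact be recovered after the fact: it is just the difference in line counts between the backup and the cleaned file. A trivial sketch (reading the two txt files into lists is left out; names are assumed):

```python
# Report how many entries a cleanup removed, given the backup and
# the cleaned glossary as lists of lines (one entry per line).
def deleted_entry_count(backup_lines, cleaned_lines):
    return len(backup_lines) - len(cleaned_lines)

backup = ["dog\tHund", "cat\tKatze", "dog\tHund"]
cleaned = ["dog\tHund", "cat\tKatze"]
print(deleted_entry_count(backup, cleaned))  # prints 1
```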
Some additional maintenance/filtering can be done by renaming the txt to csv and opening it in LibreOffice. Simply save back as tab-delimited CSV when done.
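The rename-to-csv trick works because the glossary is already plain tab-delimited text, so any CSV-aware tool can round-trip it losslessly with the delimiter set to a tab. A sketch of that round trip (io.StringIO stands in for the real glossary file):

```python
import csv
import io

# A glossary file is plain tab-delimited text; read and write it with
# the csv module using delimiter="\t" -- the "rename to .csv and open
# in LibreOffice" trick relies on exactly this property.
raw = "dog\tHund\ncat\tKatze\n"
rows = list(csv.reader(io.StringIO(raw), delimiter="\t"))

out = io.StringIO()
csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
assert out.getvalue() == raw  # lossless round trip
print(rows)
```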
Sigh. I assumed we were talking about convenience. You can do anything with such tools, indeed. I can push my car 3 miles from A to B, but most drivers would expect to start the motor. Or, more practically: you can use this tutorial to check your text, but you can also open the 300 URLs of 300 segments manually to work with and search in the rather raw JSON data (and perhaps, one day …).
And no, it's not about curiosity, it is more about control and security. Any database tool or any other program that offers the possibility to delete duplicates has (or should have) a kind of control. CT does not have this.
Sigh. You have easy deleting of duplicates. You have sorting that places duplicates visually close together. You have lots of other cool glossary features in CT, including the ability to open and manage your glossary in Excel or LibreOffice. Now, you expect a super-trooper duplicate tool within CT (let's say called "duplicator") with such enormous complexity that the user spends more time managing duplicates than translating. Sure, it is possible to build an Excel-like interface so that the user could play with duplicates. But note that it may take more lines of code than the rest of CT. I am convinced the current approach to handling glossary duplicates is optimal.
> The initial idea was to show both entries or to show an asterisk to indicate duplicates
This is distracting and takes your attention from the real task of translation. Instead, the current approach to treat double entries as one lets you simply ignore them and focus on what's essential.
Torsten: A very prominent Dutch user told me that he has more than one million entries.
However, the most prominent, smartest, most experienced, intelligent, nicest, and most modest Dutch translator counts far fewer entries in his TM for Fragments. And it makes sense:
"The statistics of English are astonishing. Of all the world's languages (which now number some 2,700), it is arguably the richest in vocabulary. The compendious lists about 500,000 words; and a further half-million technical and scientific terms remain uncatalogued. According to traditional estimates, neighboring German has a vocabulary of about 185,000 and French fewer than 100,000, including such Franglais as and ."