RE: the most frequent word is "the" with about 1300 occurrences
One of the reasons I prefer AntConc for creating a frequent-words list: it can "deduct" stop words (and create n-grams, keyword lists, and a lot more). You can find a collection of stop words in 29 languages here.
Advantages: Cross-platform, free, CT-independent, lots of lists
Disadvantages: Document must be converted to plain text; leveraging against resources requires importing into CT
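
For anyone curious what "deducting" stop words from a frequency list amounts to, here is a minimal Python sketch. It is not how AntConc or CafeTran does it internally; the tiny `STOP_WORDS` set and the function names `tokenize`, `frequent_words`, and `frequent_ngrams` are placeholders I made up for illustration. In practice you would load a full stop-word list from a file.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch);
# a real list would be loaded from a plain-text file, one word per line.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def tokenize(text):
    """Lowercase the text and return its word tokens (letters only,
    with internal apostrophes or hyphens kept inside a word)."""
    return re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text.lower())


def frequent_words(text, stop_words=STOP_WORDS, top=10):
    """Return the `top` most frequent words, skipping stop words."""
    counts = Counter(t for t in tokenize(text) if t not in stop_words)
    return counts.most_common(top)


def frequent_ngrams(text, n=2, top=10):
    """Return the `top` most frequent n-grams (stop words are kept here,
    because removing them would glue non-adjacent words together)."""
    tokens = tokenize(text)
    # zip over n shifted copies of the token list to get sliding windows
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams).most_common(top)
```

Calling `frequent_words(text)` on the plain-text export of a document gives a list of `(word, count)` pairs in which "the" and its friends no longer crowd out the terms you actually care about.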
H.
tre
I recently started a 16K-word project and tried to get the frequent words. After I start the command, nothing happens except that a tab opens; nothing else reacts. Then, after one or two minutes, the words become visible, but CafeTran stays frozen for several more minutes and I have to kill the process. (To give an idea: the most frequent word is "the" with about 1300 occurrences; I have not fine-tuned the process yet.)
I assume some people think progress bars are something out of hell, but in this case (as with pre-translation and some other features) it would be nice to have a progress bar and/or a message saying "Frequent words are still being processed, please wait" or something similar.