REQ Unstructured txt resources

Hi,


When translating from Japanese to French, quite often I have to search for technical terms in JP-EN glossaries first, then in EN-FR glossaries.


I tried to import all my JP-EN and EN-FR glossaries into a CT glossary, even though they don't have the same configuration (some have only 1 column, some have 2, 3, 4 or 5, all with different delimiters). It worked to a certain extent, but not as expected (some entries are ignored).


What would be great is the possibility to create a REFERENCE text (txt file) from different glossaries, and to make CT search inside it (like a grep search in Linux and Mac), then display all the lines that contain the searched string, whatever the format of the line (with or without tabs or other delimiters).


Possible?
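
For illustration, here is a minimal Python sketch of the kind of "reference search" described above, run outside CT. It assumes the glossaries are UTF-8 `.txt` files under one folder; the function name `search_reference` and its parameters are hypothetical. Each line is treated as plain text, so tabs and other delimiters are ignored, and Japanese strings are matched like any other text.

```python
from pathlib import Path


def search_reference(folder, needle, encoding="utf-8"):
    """Return (file name, line number, line) for every line containing needle."""
    needle = needle.casefold()
    hits = []
    # rglob also walks sub-folders; sorted() keeps the output order stable.
    for path in sorted(Path(folder).rglob("*.txt")):
        # errors="replace" keeps the scan going if a file is not valid UTF-8.
        with open(path, encoding=encoding, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                # Plain substring test: the line's column layout is irrelevant.
                if needle in line.casefold():
                    hits.append((path.name, number, line.rstrip("\r\n")))
    return hits
```

Calling `search_reference("path/to/glossaries", "gear")` would list every matching line, whichever column the term sits in. Files saved in another encoding (Shift_JIS, for example) would need a different `encoding` argument.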


Alain: What would be great is the possibility to create a REFERENCE text


I don't know, but what if you load all relevant files in a single database file? Then, it wouldn't be included in the Automated Workflow/AA, and searching is blistering fast.


H.

Hi Woorden,

How can I create a database file for unstructured txt files?
I mean, I will have to define some structure (source, target...).
Or am I missing something?

Alain: I will have to define some structure (source, target...)


Yes, but you can "structure" it yourself, and for a table that's not consulted by the Automated Workflow anyway, it doesn't matter if that "structure" doesn't correspond to real life. So you can add your JP-EN and EN-FR glossaries including extra columns to that table without having to worry if it's correct, as long as everything is there. Methinks.


H.

Thanks Woorden, I tried (failed) and will try again, but this Total Recall thing is not really easy to use. I always get all kinds of problems, both on Mac and Windows, and tend to stay away from it.


A simple Reference search function would be much simpler in my case, or at least the possibility to create a database with only one column for "plain searches".

Alain: A simple Reference search function would be much simpler in my case, or at least the possibility to create a database with only one column for "plain searches"


Pretty easy to achieve in a database table, I think. Anyway, I'll suggest anything to stop Igor paying even more attention to - and wasting time on - those silly tab-delimited files.


H.

Hi Alain,


You should be able to achieve it with the current glossary feature as follows:


1. Select "Glossary folder" in the Glossary menu.

2. Choose "Add glossary..." from the Glossary menu and select a folder with the files.

3. After the import and loading, try a search via the Search bar (Glossary button).


Note: This is an in-memory solution so your RAM is the limit of how many files you can load.


Igor

Hi Igor,
I gave it a try. I got some search results, but CT seems to search only in the first part of each line (the part after a delimiter seems to be ignored, even if I select Left and Right terms in CT's Glossary Preferences).

One question: When I add a folder with the above method, does it include sub-folders too?

 

In the meantime I installed ag.exe (Silver Searcher) on Windows, then I did this in CT's Desktop Search Tool:

ag.exe -i {} C:\folder path to my txt glossaries

It's VERY fast (much faster than Grep); I get dozens of instant results.
The only problem is that I cannot search for Japanese terms. For English and French it's perfect.

Well, the glossary interface recognizes the fields (if they have the tab delimiter) based on the header. Apparently, your glossaries are a mixture of files with and without a tab delimiter, and also with or without a header. In this case, CafeTran searches on the source side of the entry when there is a delimiter and the whole line when it 'sees' no delimiter. I might change it to search both the source and the target part of the entry in such mixed files for the next update.
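
To make the difference concrete, here is a small Python model of the matching rule described above. It is only an illustration, not CafeTran's actual code: the function name `line_matches` and the `both_sides` switch are hypothetical, and header handling is left out entirely.

```python
def line_matches(line, needle, both_sides=False, delimiter="\t"):
    """Illustrative model of the described rule; not CafeTran's real code."""
    needle = needle.casefold()
    if delimiter not in line:
        # No delimiter seen: the whole line is searched.
        return needle in line.casefold()
    # With a delimiter, everything before the first one counts as the source.
    source, _, target = line.partition(delimiter)
    if needle in source.casefold():
        return True
    # The target side is only consulted under the proposed behaviour.
    return both_sides and needle in target.casefold()
```

Under this model, searching for "gear" in the line "歯車<TAB>gear" finds nothing by default, because only the source side is searched, but it does match with `both_sides=True`. That would be consistent with the missing hits reported above.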


> One question: When I add a folder with the above method, does it include sub-folders too?


The subfolders are not included.


> ag.exe -i {} C:\folder path to my txt glossaries


Great! I was just about to suggest CafeTran's Desktop Search Tool for the above scenario. 

Thanks Igor, searching both source and target in the next update would be great!

---
A correction regarding ag.exe: it works fine when searching for English strings. I checked the developers' site: there are problems with special characters (éàô and so on, Chinese, Japanese...).

 
