I am translating HTML and PHP files. Which segmentation options should I pick in the "Preferences" dialog? If I pick "Sentence", the segmentation rules for these file types leave some text out or do not segment the way I would like.
For example, a table of contents or a list whose items are wrapped in <li> and </li> HTML tags is bunched up into one big segment, with the list items separated by CT's red number tags. Splitting a long table of contents by hand is not productive, and leaving it as is prevents me from accessing the section titles individually when I reach them in the rest of the file.
Also, some segments are left out and I have to open the file in BBEdit to get to them after import.
For example, "Cyclone Model" is not segmented and stays hidden in:
<div class="center"><img src="media/graphics/model.jpg" alt="Cyclone Model" /> </div>
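To make the expected behaviour concrete, here is a minimal Python sketch of the kind of extraction described above: one segment per list item, plus the text of "alt" attributes. It is not how CafeTran or Okapi work internally. The names SegmentExtractor, extract_segments, BLOCK_TAGS and TRANSLATABLE_ATTRS are made up for this illustration, the choice of tags and attributes is an assumption, and PHP blocks are not handled at all.

```python
from html.parser import HTMLParser

# Assumption for this sketch: attributes whose values should be translated.
TRANSLATABLE_ATTRS = {"alt", "title"}
# Assumption for this sketch: tags that start and end a segment.
BLOCK_TAGS = {"li", "p", "div", "td", "th", "h1", "h2", "h3", "h4", "h5", "h6"}


class SegmentExtractor(HTMLParser):
    """Collects one segment per block-level element and per translatable attribute."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.segments = []
        self._buffer = []

    def _flush(self):
        # Collapse whitespace and store the buffered text if anything is left.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()
        for name, value in attrs:
            # value is None for attributes written without a value.
            if name in TRANSLATABLE_ATTRS and value:
                self.segments.append(value.strip())

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()


def extract_segments(html):
    """Return the translatable segments found in an HTML string, in document order."""
    parser = SegmentExtractor()
    parser.feed(html)
    parser.close()
    return parser.segments


if __name__ == "__main__":
    sample = (
        "<ul><li>Introduction</li><li>Installation</li></ul>"
        '<div class="center"><img src="media/graphics/model.jpg" '
        'alt="Cyclone Model" /> </div>'
    )
    for segment in extract_segments(sample):
        print(segment)
```

With the sample above, each list item comes out as its own segment and "Cyclone Model" is picked up from the image tag, which is the result I am hoping to get from a CAT tool filter.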
So if anyone can point me in the right direction regarding segmentation rules for HTML/PHP files and how to apply them in CT, it would be much appreciated.
Thanks for your insight.
Thanks a lot for the links. Okapi may be a solution. But even with the latest Java installed on my Mac (El Cap), Rainbow complains that it needs 1.7 or higher! Weird thing though... System Preferences (and Java.com) say version 8 is installed, but typing "java -version" in Terminal says it's version 6?
Anyway, if someone has a solution that involves only CafeTran, I'm all for it. Or if any Mac users know about the version discrepancy... Googling "java discrepancy" returned useless answers.
Duh... a restart took care of the Java discrepancy! And the apps are launching fine. Now on to trying Okapi.
But if Okapi can handle HTML/PHP files properly, surely there is a way of telling CT to segment those files as well.
Did you manage to start working with the Okapi framework?
I successfully went through the example that proves the installation works and I looked into Ratel and segmentation rules basics. But then I had to work!
However, I saw that there is an option in Ratel called "sub-flow" which seems to pick up the text in the "alt" attribute. But I haven't been able to tell Ratel to split segments at the <li> and </li> tags...
I'll go back to Rainbow later and try to figure it out. I'll come back here when I have gotten further.