I may soon have a large HTML translation job coming up, and before I start making any promises to the client I wanted to check how CT handles HTML files.
Does CT play nice with HTML files? Are there any particular pitfalls to watch out for?
Perhaps you're not happy with this suggestion, but whenever I find that CafeTran's file format filters aren't up to my expectations, I use another CAT tool (like Studio or Transit) to create a file that I then translate in CafeTran. I do this for FrameMaker MIF.
Don't know if it is too late for you, but I do translate html files with some php code in them. I find WordFast 4 (which I try to avoid most of the time) handles them better. CT does not segment the alt text, so you have to open the file with BBEdit or the like and translate those snippets manually. They are usually short, but if you have lots of them, it is a pain. I don't like segmenting "elsewhere" and bringing it back to CT as you can't preview the document in CT (or am I missing something?).
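For anyone who wants to see how many alt texts a file contains before opening it in a text editor, here is a minimal sketch in Python using only the standard library. It is not part of CafeTran; the names AltTextCollector and collect_alt_texts and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collect the values of alt attributes, which are translatable text."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if name == "alt" and value and value.strip():
                self.alt_texts.append(value.strip())


def collect_alt_texts(html):
    """Return every non-empty alt text found in the given HTML string."""
    collector = AltTextCollector()
    collector.feed(html)
    collector.close()
    return collector.alt_texts


# Hypothetical sample markup, not taken from a real project.
sample = '<p>Forecast <img src="cone.png" alt="Cone of uncertainty map"></p>'
print(collect_alt_texts(sample))
```

This only lists the snippets; writing the translations back into the file would still be a manual step or need a separate script.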
Yes, the missing bilingual preview is a bummer... Adding/improving file filters would be a major improvement, although maybe not the most rewarding developer activity. Igor would have to brew several cups of his cafetran coffee!
About to embark on a fairly big html project. Any chance the html segmentation rules will see an improvement and include sub-flow items?
Oh and the following code (only a snippet) is segmented into a single huge segment with the list of items separated by tags. For long tables of contents (where titles have to be consistently used throughout the doc), it is a pain to have to split the segment into dozens of small segments to be used later.
Would it be terribly hard to tell CT to segment each list item?
<li><a href="#page_3-0-0">Creating the Cone of Uncertainty</a> <ul class="nav" id="ul_3-0-0"> <li><a href="#page_3-1-0">Using the Cone of Uncertainty</a></li> <li><a href="#page_3-2-0">One to Five-Day Track Errors</a></li> <li><a href="#page_3-3-0">Two-Day Track Scenarios</a></li> </ul> </li>
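To make the request concrete, here is a rough sketch in Python (standard library only) of what "one segment per list item" could look like for the snippet above. It is an illustration, not how CafeTran's filter works: the BLOCK_TAGS set and the names ListItemSegmenter and segment_html are assumptions of mine. Block-level tags end the current segment, while inline tags such as <a> do not.

```python
from html.parser import HTMLParser

# Tags treated as segment boundaries in this sketch.
# This set is an assumption, not CafeTran's actual rule set.
BLOCK_TAGS = {"li", "ul", "ol", "p", "div", "td", "th", "tr", "table"}


class ListItemSegmenter(HTMLParser):
    """Split text into segments at block-level tag boundaries."""

    def __init__(self):
        super().__init__()
        self.segments = []
        self._buffer = []

    def _flush(self):
        # Collapse runs of whitespace and drop segments that are empty.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()


def segment_html(html):
    """Return the list of text segments found in the HTML string."""
    segmenter = ListItemSegmenter()
    segmenter.feed(html)
    segmenter.close()
    return segmenter.segments


toc = (
    '<li><a href="#page_3-0-0">Creating the Cone of Uncertainty</a> '
    '<ul class="nav" id="ul_3-0-0"> '
    '<li><a href="#page_3-1-0">Using the Cone of Uncertainty</a></li> '
    '<li><a href="#page_3-2-0">One to Five-Day Track Errors</a></li> '
    '<li><a href="#page_3-3-0">Two-Day Track Scenarios</a></li> '
    '</ul> </li>'
)
for segment in segment_html(toc):
    print(segment)
```

With this rule the four titles come out as four separate segments, which is the behaviour asked for here. A real filter would also have to keep the inline tags so the file can be exported again.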
>it is a pain to have to split the segment into dozens of small segments to be used later.
Personally, I'd be very careful with splitting and joining. To avoid export problems later.
Perhaps you'd like to use the Okapi HTML filter:
Igor, how about adding support for these filters? Wouldn't that be a more effective approach?
> Igor, how about adding support for these filters? Wouldn't that be a more effective approach?
Nope. I've never seen OKAPI. As with every CT feature, I wish to improve CafeTran's html filter gradually but I don't really know when it will be accomplished. I can succeed next week, next month, next year or tomorrow. I generally have no idea when something can be completed.
>Personally, I'd be very careful with splitting and joining. To avoid export problems later.
Hmm, yeah. Maybe I can just translate the whole bunch and then commit each subsegment to memory to be retrieved later. Okapi! I think I'm too old to learn new tricks ;-) but that's worth a try.
> I wish to improve CafeTran's html filter gradually... next week, next month, next year or tomorrow
Thanks, it's still good to know html filter improvement is in the pipe.
> Would it be terribly hard to tell CT to segment each list item?
Okay, the above has just been fixed in update 3 (build 2017031401). Please download and install the update again. You will need to start a new project with the changed html filter.
>I wish to improve CafeTran's html filter gradually
Wouldn't a regular expressions tagger be a more flexible solution/approach/hans_happy_maker?
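To illustrate what I mean by a regular expressions tagger, here is a minimal sketch in Python. It is only an assumption about how such a feature might behave, not an existing CafeTran function: TAG_PATTERN stands in for a user-editable pattern, and tag_with_regex and restore_tags are hypothetical names. Everything the pattern matches becomes a numbered placeholder that is put back on export.

```python
import re

# Hypothetical user-editable pattern: anything it matches is treated as a tag.
TAG_PATTERN = re.compile(r"<[^>]+>")

# Matches the numbered placeholders produced by tag_with_regex.
PLACEHOLDER = re.compile(r"\{(\d+)\}")


def tag_with_regex(text, pattern=TAG_PATTERN):
    """Replace each pattern match with {1}, {2}, ... and return (text, tags)."""
    tags = []

    def _replace(match):
        tags.append(match.group(0))
        return "{%d}" % len(tags)

    return pattern.sub(_replace, text), tags


def restore_tags(text, tags):
    """Put the original tags back in place of the numbered placeholders."""

    def _restore(match):
        index = int(match.group(1)) - 1
        # Leave unknown placeholder numbers untouched.
        return tags[index] if 0 <= index < len(tags) else match.group(0)

    return PLACEHOLDER.sub(_restore, text)


source = '<li><a href="#page_3-1-0">Using the Cone of Uncertainty</a></li>'
tagged, tags = tag_with_regex(source)
print(tagged)
print(restore_tags(tagged, tags) == source)
```

The appeal is that the user could change the pattern per project (php snippets, placeholders and so on) without waiting for a new filter. The catch is that a regular expression does not understand nesting, so it complements a real HTML parser rather than replacing one.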