
Unnecessary tag recognition in OOXML (docx)

Hi!

I'm translating a batch of OOXML (docx) documents that hold web page content. Here and there the text contains HTML tags written out as plain text: <br>, <b>, <h1>, and others.

My problem is that CafeTran treats the <b></b> tags as bold formatting in the target segment, although they appear as plain text in the source segment.
– So if the source goes like: "<b>some untranslated text</b>"

– The target gets converted into "bsome translated textb". The output document then contains bold text, without the literal <b></b> tags.

– Whereas I would expect CafeTran to leave the <b></b> tags as plain text and apply no bold formatting: "<b>some translated text</b>".

Other "tags" (line breaks, headings) seem to be fine, and I haven't encountered any literal italic or underline tags so far.
Is there anything I can do inside CafeTran, such as a setting or some other workaround?


If you are translating an MS Word or LibreOffice document, the HTML formatting tags for bold, italics and underline are converted automatically into the corresponding Word or LibreOffice formatting. An option to skip the conversion might be introduced in a future update.

Thanks for the information! I'll keep track and undo the changes manually in the target document.
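
For anyone who needs to undo the conversion outside CafeTran, here is a minimal sketch of how the manual fix could be scripted. It is not a CafeTran feature. It assumes the exported docx stores the unwanted bold as a `w:b` element in a run's properties (the main body of a docx is the `word/document.xml` part inside the zip archive), and the function name `restore_literal_bold_tags` is made up for this illustration. It works on the XML text of that part and turns every bold run back into plain text wrapped in literal <b></b> tags.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
ET.register_namespace("w", W)  # keep the "w:" prefix when serializing


def restore_literal_bold_tags(document_xml: str) -> str:
    """Hypothetical helper: replace bold runs with literal <b>...</b> text."""
    root = ET.fromstring(document_xml)
    for run in root.iter(f"{{{W}}}r"):
        props = run.find("w:rPr", NS)
        if props is None:
            continue
        bold = props.find("w:b", NS)
        if bold is None:
            continue
        # w:val="0" or "false" means bold is explicitly switched off
        if bold.get(f"{{{W}}}val") in ("0", "false"):
            continue
        texts = run.findall("w:t", NS)
        if not texts:
            continue
        # Drop the bold formatting and put the tags back as plain text
        props.remove(bold)
        texts[0].text = "<b>" + (texts[0].text or "")
        texts[-1].text = (texts[-1].text or "") + "</b>"
    return ET.tostring(root, encoding="unicode")


sample = (
    f'<w:p xmlns:w="{W}">'
    "<w:r><w:rPr><w:b/></w:rPr><w:t>some translated text</w:t></w:r>"
    "</w:p>"
)
fixed = restore_literal_bold_tags(sample)
print(fixed)
```

Two caveats: the sketch cannot tell converted tags from text that was genuinely bold in the source, and a bold passage split across several runs would get one pair of tags per run, so the result still needs a manual check.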

The option to turn off the conversion of HTML formatting tags at export is available in the latest CafeTran update, 10.4.1.



Wow, I'm a little speechless at how fast you implemented it! Thank you very much!
