REQ: Tag = word separator

Yesterday I made some interesting discoveries:

This is what the target segment looks like:


I usually put an additional space before the tag to get a clean TM, but here that was not possible: "EINBLENDEN" is bold and underlined, so I would get an underlined space. And it is not good practice to put an additional space after the tag.

Please note the Hunspell behaviour.

Now this is what the "List words with unknown spelling" command puts out:


This is inconsistent, isn't it? Can this be resolved in CT or is it a Hunspell problem?

There is one more thing, now concerning Auto-Completion:


In the corresponding earlier target segment there was a tag between Füllstand and the dots. Sure, I can react and choose entry #3, and sometimes keeping punctuation marks of any kind in AC (or, more precisely, at the end of AC entries) can have its pros, but here it is somewhat annoying, isn't it?

The third point is that the MT engines often do not recognize two words as separate when they are separated only by a tag (and not by an additional space). This mostly works in other tools.

Sure, there are cases where a file contains tons of tags inside words. But that is what we have the docx/OCR filter for. Or would this feature, "tag = word separator", break that filter?

1 Comment

> "tag = word separator"

Whether a tag can be treated as a word separator depends on the type of the tag. For example, a bold formatting tag that bolds only part of a word should not be treated as a separator, while a paragraph tag does have this space included in its tag attributes. You can examine a tag by pressing the F3 shortcut and hovering the mouse over the tag number.
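
To illustrate the idea of type-dependent separators, here is a minimal sketch in Python. It is not how CafeTran or Hunspell work internally. The tag syntax (`<type:number>`), the set of separator types, and the function name `split_words` are all assumptions made up for this example.

```python
import re

# Hypothetical tag syntax, for illustration only: <type:number>, e.g. <b:1> or <p:2>.
TAG = re.compile(r"<(\w+):\d+>")

# Assumption: tag types that imply whitespace. This list is invented, not taken from CafeTran.
SEPARATOR_TYPES = {"p", "br", "tab"}


def split_words(segment: str) -> list[str]:
    """Split a tagged segment into words, treating only some tag types as separators."""

    def replace(match: re.Match) -> str:
        # A separator-type tag becomes a space; any other tag is simply dropped,
        # so the text on both sides is joined into one word.
        return " " if match.group(1) in SEPARATOR_TYPES else ""

    return TAG.sub(replace, segment).split()
```

With these assumptions, `split_words("EIN<b:1>BLENDEN")` keeps the word in one piece, while `split_words("Füllstand<p:2>EINBLENDEN")` yields two words. A word list or spell check built on such a rule would at least be consistent with itself.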
