Hi all, whenever I create a project from a Word file that contains superscript/subscript characters, CT segments the source text directly before and after those characters. I've attached 2 pictures: one shows the source text in Word, the other shows what the segmentation looks like. Is this normal, or is something wrong with my settings? Of course I can join all the segments by pressing ALT + up, but I am wondering whether this can be improved. Many thanks in advance for helping!
Is "Segment at all tags" enabled in Edit > Preferences > General? This could be causing the segmentation, since superscript/subscript characters are typically marked with tags in CafeTran.
Also, if you use a custom SRX, can you reproduce this with the built-in "Sentence" segmentation rule?
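For anyone finding this thread later, here is a rough toy illustration of why an option like "Segment at all tags" would split text around superscript/subscript characters. This is only a sketch and not CafeTran's actual code: the `<sup>`/`<sub>` tag format, the `TAG` pattern and the `segment` function are all made up for the example.

```python
import re

# Hypothetical inline-tag pattern; CafeTran's real tag format may differ.
TAG = re.compile(r"</?\w+>")


def segment(text: str, segment_at_all_tags: bool) -> list[str]:
    """Toy segmenter: optionally break the text at every inline tag."""
    if not segment_at_all_tags:
        # Tags stay inline, so the sentence remains a single segment.
        return [text]
    # Split at each tag, dropping the tags themselves and any empty pieces.
    parts = TAG.split(text)
    return [p.strip() for p in parts if p.strip()]


source = "The area is 25 m<sup>2</sup> and the gas is CO<sub>2</sub>."

print(segment(source, segment_at_all_tags=True))
print(segment(source, segment_at_all_tags=False))
```

With the option on, the toy sentence falls apart into pieces before and after each tagged character, which matches the behaviour described above. With it off, the sentence stays in one segment.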
Yes, the "Segment at all tags" checkmark was the cause, thank you for pointing this out. You are the best!!! Have a wonderful Christmas!