Would it be possible to have a glossary entry with ₂ recognised as <tag>2<tag>?
Currently there is no way to have terms like CO<tag>2<tag>-value recognised.
Entering terms like CO₂-value in the glossary would be very intuitive.
I've found a way to have terms like CO₂-value auto-assembled correctly in those cases where the author of the source document used a normal 2 formatted as subscript instead of the Unicode ₂.
The trick is to use the figure dash (U+2012) instead of the normal dash.
Of course, a sentence with the Unicode ₂ and the same sentence with a normal 2 in subscript are not identical, so you won't get exact matches (EMs). But the surrounding tags for the normal 2 in subscript are placed correctly.
I'm quite happy with this discovery, since I translate a lot about CO₂.
The sentences highlighted in yellow in the MS Word export document are the ones with the Unicode ₂.
BTW: If you want to replace the normal dash with the figure dash before lowercase letters, you can use this regular expression in BBEdit:
In CafeTran Espresso you'd have to use $1‒$3.
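For anyone who wants to try the same replacement outside BBEdit or CafeTran Espresso, here is a rough Python sketch. The pattern is only my assumption, not the BBEdit expression mentioned above: it uses three groups so that the replacement mirrors $1‒$3 (character before the dash, the dash itself, the lowercase letter after it). The names `PATTERN` and `to_figure_dash` are made up for this example.

```python
import re

FIGURE_DASH = "\u2012"  # the figure dash, U+2012

# Assumed pattern (not the original BBEdit expression):
#   group 1 = any non-space character before the dash
#   group 2 = the normal dash (hyphen-minus)
#   group 3 = an ASCII lowercase letter after the dash
PATTERN = re.compile(r"(\S)(-)([a-z])")


def to_figure_dash(text: str) -> str:
    """Replace a normal dash with a figure dash when a lowercase letter follows."""
    # Keep groups 1 and 3, swap group 2 for the figure dash (like $1‒$3).
    return PATTERN.sub(lambda m: m.group(1) + FIGURE_DASH + m.group(3), text)


if __name__ == "__main__":
    print(to_figure_dash("CO2-value and CO₂-value"))
```

Dashes followed by an uppercase letter, or preceded by a space, are left alone. Accented lowercase letters are not covered by `[a-z]`, so the pattern would need widening for those.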