Have ₂ recognised as <tag>2<tag>

Would it be possible to have a glossary entry with ₂ recognised as <tag>2<tag>?

Currently there is no way to have terms like CO<tag>2<tag>-value recognised.

Entering terms like CO₂-value in the glossary would be very intuitive.

Good news!

I've found a way to have terms like CO₂-value auto-assembled correctly, in those cases where the author of the source document didn't use the Unicode ₂ but a normal 2 in subscript.

The trick is to use the figure dash (U+2012) instead of the normal dash (the hyphen-minus, U+002D).

Of course, sentences with the Unicode ₂ are not identical to sentences with a normal 2 in subscript, so you won't get exact matches (EMs). But the tags surrounding the normal 2 in subscript are placed correctly.

I'm quite happy with this discovery, since I translate a lot about CO₂.

The sentences highlighted in yellow in the MS Word export document are the ones with the Unicode ₂.



BTW: If you want to replace the normal dash with the figure dash before lowercase letters, you can use this regular expression in BBEdit:


In CafeTran Espresso you'd have to use $1‒$3 as the replacement.
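For anyone who wants to try the same replacement outside BBEdit or CafeTran Espresso, here is a rough Python sketch. It is not the expression mentioned above: the pattern, the function name `to_figure_dash` and the rule "hyphen directly after a non-space character and directly before a lowercase a-z letter" are all my own assumptions about what such a replacement could look like.

```python
import re

FIGURE_DASH = "\u2012"  # the figure dash, ‒

# Assumed pattern (not the original BBEdit expression): a hyphen-minus that
# directly follows a non-space character and directly precedes a lowercase
# ASCII letter. Lookarounds are used so the neighbours are not consumed.
HYPHEN_BEFORE_LOWERCASE = re.compile(r"(?<=\S)-(?=[a-z])")


def to_figure_dash(text: str) -> str:
    """Replace qualifying hyphens with the figure dash (U+2012)."""
    return HYPHEN_BEFORE_LOWERCASE.sub(FIGURE_DASH, text)


if __name__ == "__main__":
    print(to_figure_dash("CO2-value and CO₂-value"))
```

Hyphens before uppercase letters or digits, and hyphens with a space in front of them, are left alone in this sketch. Adjust the character class if your terms start with accented lowercase letters.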

