In this case, alignment problems can only occur if the punctuation of the TL differs from the punctuation of the SL. Unlikely, but it can happen, and you'll find out soon enough. Regexes can - and therefore will - wreak havoc on the TM, and they'll suffer from exactly the same problem.
Simply changing a segmentation setting in the TMX file.
In that case, our troubles are over. I doubt if it works.
H.
BTW: The solution suggested for oT (in the oT forum) is nice too. Simply changing a segmentation setting in the TMX file.
>My problem at the moment is, that I can't seem to avoid aligning.
That's what I wrote to you some days ago.
>It's not a real problem at this stage
It actually is a big problem. At least for me. Every alignment introduces possible new errors. I have the same feelings towards aligning as you have towards regular expressions :).
Lenting: The iPhones aren't matched because of the ) after 'surprise'
I know, that's why I put the ) in.
I still think it's a good idea to let CT do the segmentation: You wouldn't need a very complex regex, and it would be the same segmentation as the source document shows, if it's a CT project. If not, you'll have AA insertable fragments. So forget about those regexes. My problem at the moment is, that I can't seem to avoid aligning. It's not a real problem at this stage, but I want to reduce steps.
H.
Duh! A double escape is needed for the ):
(?<=[a-z%\d\\)][\.\!\?]) (?=([a-z]?[A-Z]))
I cannot get the closing parenthesis into the first matching expression.
Expression:
(?<=[a-z%\d][\.\!\?]) (?=([a-z]?[A-Z]))
Result:
The iPhones aren't matched because of the ) after 'surprise'. Not good.
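For anyone who wants to see the difference between the two expressions without a CT project at hand, here is a minimal sketch in Python. The sample sentence is made up (it is not the text from the screenshot), and it assumes that the regex engine in your tool treats lookbehinds and character classes the same way Python's `re` module does.

```python
import re

# The two expressions quoted in this thread, copied verbatim.
WITHOUT_PAREN = r"(?<=[a-z%\d][\.\!\?]) (?=([a-z]?[A-Z]))"
WITH_PAREN = r"(?<=[a-z%\d\\)][\.\!\?]) (?=([a-z]?[A-Z]))"

# Made-up sample text (an assumption, not taken from the thread).
SAMPLE = "It works (what a surprise). iPhones are fine. Done 100%. Next one."


def split_segments(pattern, text):
    """Split text at every space matched by the pattern."""
    # Replace each matched space with a newline, then split on newlines.
    # re.split is avoided because it would also return the text captured
    # by the group inside the lookahead.
    return re.sub(pattern, "\n", text).split("\n")


for label, pattern in [("without )", WITHOUT_PAREN), ("with )", WITH_PAREN)]:
    print(label, split_segments(pattern, SAMPLE))
```

With the first expression the sentence starting with "iPhones" stays glued to the previous one, because the `)` before the full stop is not in the character class. With the second expression it should be split off as its own segment.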
Lenting: Next time, please test before you post
I did. Got mixed up with all the test files on my desktop.
Okay, I'll concentrate on the regular expressions then
Don't. Too complicated. My original solution still should work, with CT doing the segmentation, but it would require aligning, and I think that wouldn't be necessary.
H.
Haha, I was just testing your procedure :). Next time, please test before you post. Think before you speak. Or even better: don't speak.
(Just joking here, like you always are. At least, I cannot imagine that you mean all the mean things that you write here and elsewhere. Okay, I'll concentrate on the regular expressions then.)
For what it's worth: here's the Word document that you instructed me to create.
Forget the above. I still think it's possible - No regexes, no aligning, no nothingness required - but it seems I forgot a step or two.
H.
It looks like it's all much easier.
I imported a .docx of the above (plus blabla) in CT. Perfect!
H.
Lenting: All pretty straightforward
I doubt it. And I'm quite sure there will be many other "exceptions." Before His Igorness came up with his solution, I tried to write a regex myself. I used a short, real-life text (an email by my baby sister, who will turn 55 next Monday). That already showed one of the above problems. I admit I wrote the text for the Regexr screenshot above myself.
...how about letting CafeTran use its segmentation rules here?
That may or may not be possible. Unless, of course, you use my "solution," the one I provided on ProZ and here. That would most certainly be possible, and aligning shouldn't be a problem, especially not in this case.
H.
HL