
Resegment XLIFF from paragraph to sentence

From: https://groups.yahoo.com/neo/groups/okapitools/conversations/messages/5412


You'll need to download the tool: http://okapi.sourceforge.net/downloads.html


Hi,

 

Thanks for the files.

If your goal is to segment the input XLIFF file, you normally need a simple pipeline that: 1) reads the file, 2) segments the source, 3) writes the file back.

There is no need to create a translation kit, run leveraging, or do anything else.

The pipeline you are using creates a T-Kit, which is why you get a .xlf.xlf output with different XLIFF data: your XLIFF file has been extracted into the XLIFF file of the T-Kit.

 

In your case you just need a pipeline with:

 

1) RawDocumentToFilterEvents

2) Segmentation

3) FilterEventsToRawDocument

 

In the segmentation options:

- Select the option to segment the source, and choose the SRX file to use.

- The other defaults are likely fine.
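
For anyone who wants to picture what those three steps do to the file, here is a rough Python sketch of the same read / segment / write flow. It is not how Okapi implements it: the regular expression is a naive stand-in for real SRX rules, `split_sentences` and `segment_xliff` are made-up names, and it assumes XLIFF 1.2 with source text that has no inline codes.

```python
import re
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)  # keep the default namespace on output

# Naive stand-in for SRX rules: break on whitespace that follows . ! or ?
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def q(tag):
    """Qualify a tag name with the XLIFF 1.2 namespace."""
    return "{%s}%s" % (XLIFF_NS, tag)


def split_sentences(text):
    """Return (sentence, whitespace_after) pairs for the given text."""
    pairs, start = [], 0
    for match in SENTENCE_BREAK.finditer(text):
        pairs.append((text[start:match.start()], match.group()))
        start = match.end()
    if start < len(text):
        pairs.append((text[start:], ""))
    return pairs


def segment_xliff(xliff_text):
    """Hypothetical helper: read the XLIFF, segment each source, write it back."""
    root = ET.fromstring(xliff_text)                # 1) read the file
    for unit in root.iter(q("trans-unit")):
        source = unit.find(q("source"))
        # This sketch skips empty sources and sources with inline codes.
        if source is None or not source.text or len(source):
            continue
        pairs = split_sentences(source.text)        # 2) segment the source
        if len(pairs) < 2:
            continue                                # single segment: no markers
        seg_source = ET.Element(q("seg-source"))
        for mid, (sentence, gap) in enumerate(pairs):
            mrk = ET.SubElement(seg_source, q("mrk"),
                                {"mtype": "seg", "mid": str(mid)})
            mrk.text = sentence
            mrk.tail = gap                          # whitespace stays between segments
        unit.insert(list(unit).index(source) + 1, seg_source)
    return ET.tostring(root, encoding="unicode")    # 3) write the file back
```

The original `<source>` is left untouched and the segmented copy goes into a sibling `<seg-source>`, which is how XLIFF 1.2 represents segmentation.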

 

Make sure your languages and encoding are set properly, then execute the pipeline.

 

The *.out.xlf files created should have your entries segmented.

 

Note that entries that have a single segment are by default output without segment markers.

This is because XLIFF doesn’t have a way to indicate if an entry with a single segment has been segmented or not.
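
If you want to check the output yourself, a small script along these lines could help. It is only a sketch: `count_segments` is a made-up name, and it assumes XLIFF 1.2 output where segments appear as `<mrk mtype="seg">` inside `<seg-source>`. An entry with no `<seg-source>` is reported as `None`, because from the file alone you cannot tell whether it was segmented into one segment or not segmented at all.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def count_segments(xliff_text):
    """Map each trans-unit id to its number of marked segments.

    None means the entry has no <seg-source>: either a single segment
    written without markers, or an entry that was never segmented.
    """
    ns = {"x": XLIFF_NS}
    counts = {}
    root = ET.fromstring(xliff_text)
    for unit in root.iterfind(".//x:trans-unit", ns):
        seg_source = unit.find("x:seg-source", ns)
        if seg_source is None:
            counts[unit.get("id")] = None
            continue
        marks = [m for m in seg_source.iterfind("x:mrk", ns)
                 if m.get("mtype") == "seg"]
        counts[unit.get("id")] = len(marks)
    return counts
```

Running it over one of the *.out.xlf files (read into a string) gives a quick view of which entries were split and which were left as a single block.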

 

I’ve attached the output as an example.

I hope this helps.

-yves

 
