Looking for large bilingual patent-related data collections

Hello fellow CafeTranslators! It's me again.

  

I'm looking for large collections of bilingual data involving patent-related material in the following language pairs:


Asian languages: Chinese, Korean, Japanese, Thai, etc.


and 


European languages: English, French, German, Italian, Spanish, etc.


Does anyone here know where I can find anything freely available online?


In case you're wondering, it's for a client who is trying to develop a custom MT engine specialised in patents.


Michael


Hi Hans,


I have no idea what they are going to do with the data, or how they are going to use it; I'm paid only to locate/align/convert/supply it.


Michael

There are many. Please see the attached file.


M,

zip

Japanese for this

zip

PCT in English and Japanese

zip
(7.44 MB)

Hi Masato,


Think you forgot to attach the file!


Michael

Wow, thanks for all that! 


I obviously don't speak Japanese (or many of the other languages I'm looking for data in), so I won't be able to check the quality of any of these, but there ought to be some useful material among the various links and files you posted.


Michael

Open Japanese-English translations are rarely to my taste, but may be useful for others.

=Renewed=


Online resources:


Japanese law translation (including patent law)

http://www.japaneselawtranslation.go.jp/?re=02


 * This data can also be found in "Honyaku Star"


Japanese law translation memory (downloadable TMX file of the above translations, but it contains errors...)

http://itrd.crestec.co.jp/transmemoryweb/


Japanese Patent Office

https://www.jpo.go.jp/index.htm
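
Since the TMX download above is said to contain errors, it may be worth pulling the segment pairs out and skimming them before passing the file on. Here is a minimal Python sketch, assuming a standard TMX layout (`tu` / `tuv` / `seg` elements, with the language given in `xml:lang` or the older `lang` attribute). The function name `read_tmx_pairs` and the file name in the usage note are made up for illustration, and I haven't checked the sketch against the actual file from that site.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(source, src_lang="ja", tgt_lang="en"):
    """Yield (source, target) segment pairs from a TMX file or file-like object.

    Hypothetical helper: assumes one <seg> per <tuv> and compares only the
    primary language subtag, so "en-US" is treated as "en".
    """
    tree = ET.parse(source)
    for tu in tree.getroot().iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text sitting inside inline tags.
                segs[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        src, tgt = segs.get(src_lang), segs.get(tgt_lang)
        # Skip units where either side is missing or empty.
        if src and tgt:
            yield src, tgt
```

Usage would be something like `for ja, en in read_tmx_pairs("law.tmx"): print(ja, "|", en)`, where `law.tmx` stands in for whatever the downloaded file is called. Skipping units with an empty side is only a first filter; it won't catch misaligned or mistranslated segments.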


Unless I've completely misunderstood Tom's webinar on personalised MT systems, I think that feeding an MT system with such large collections covering an almost infinite number of subjects won't lead to good results. Let's not forget that patents can refer to almost anything. Okay, you could separate the data by subject field, but how many man-years would that cost?

Is the client just starting out in patent translation? If so, why try to go so broad?

If she's already in the business of patent translation, why not use her own TMX files?

Only these will lead to high-value, personalised MT systems.