We have just started a project here at the Radboud University of Nijmegen that deals with Passage Retrieval and Text Mining in patent texts. I was wondering if anyone could point me to literature, research, or interesting facts on the linguistic and statistical characteristics of the language used in patent texts (e.g. frequency and hierarchical organisation of PP-attachments, use of gerund clauses vs. relative clauses with an inflected verb, average sentence length in the different sections).
I will of course post a summary of your replies on this list.
Thank you ever so much!
Eva D'hondt, PhD student
Centre for Language and Speech Technology
University of Nijmegen
Email: e.dhondt at let.ru.nl