[Corpora-List] copyright issues

Francis Tyers ftyers at prompsit.com
Fri Feb 27 14:15:16 CET 2009

On Fri, 27-02-2009 at 13:21 +0100, Christian Chiarcos wrote:

> The problem is even worse, because it is not entirely clear what counts as
> a derived work (annotations? statistical models trained on them?), and
> to what degree the copyright owner of the original text also holds
> copyright in the derived work. If the copyright status of the corpus data
> is problematic, then derived works may be problematic as well.
> At least for this reason, it's safer to ask for a written agreement from
> the publisher stating explicitly what you're allowed to do with the data.
> The only legal alternative is to restrict your corpora to illustrative
> examples, i.e., to use at most a fraction (e.g., <=15% per document as a
> rule of thumb) of the original text.
> But even this practice does not guarantee full legal security unless it is
> confirmed by some kind of verdict.

Or, taking us back to the beginning of the thread, it is even safer to annotate texts that don't have these restrictions :)

