[Corpora-List] Difference in POS tag distribution in different genres

Trevor Jenkins trevor.jenkins at suneidesis.com
Tue Dec 18 12:48:12 CET 2012


On 18 Dec 2012, at 11:12, Phil Gooch <philgooch at gmail.com> wrote:


> I've done a bit of work on analysis of the distribution of pronouns in clinical narratives (discharge summaries, progress notes and lab reports), and how this can help with protagonist identification and coreference resolution. I don't know if this is of interest to you, but I can point you to a paper and the relevant chapter of my PhD thesis if you'd like to know more.

If one of the languages involved is signed then I'd be interested.


>
> Phil
>
>
> On Mon, Dec 17, 2012 at 2:52 PM, Trevor Jenkins <trevor.jenkins at suneidesis.com> wrote:
> On 17 Dec 2012, at 03:24, Adam Kilgarriff <adam at lexmasterclass.com> wrote:
>
> > > more proper nouns in newspaper text than in fiction
> >
> > certainly true. In general, the more formal/informational a text is, the more nominal, with more nouns, adjs/determiners; the more informal/interactional, the more verbs and pronouns. Fiction and newspaper are noteworthy for past tenses and 3rd-person pronouns.
>
> Interesting, but does that hold for all other languages? For example, signed languages, and specifically British Sign Language. There are several signed language corpora now; do those support this assertion? And what about those written languages that, like BSL, do not have a tense system but use time markers instead?
>
> Regards, Trevor.
>
> <>< Re: deemed!
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>

Regards, Trevor.

<>< Re: deemed!
