[Corpora-List] Source code corpora

Darren Pearce darren.pearce at gmail.com
Thu Nov 20 18:50:52 CET 2008


Not forgetting Google Project Hosting as well (http://code.google.com/hosting/). :-)

On Thu, Nov 20, 2008 at 5:09 PM, Alexandre Rafalovitch <arafalov at gmail.com> wrote:


> Wouldn't any source code repository with a version control system give
> you that automatically? They all tell you exactly which code was
> contributed and by whom.
>
> E.g. SourceForge, Apache or Linux Kernel collections.
>
> http://www.koders.com/ might be a good way to search, if you are
> trying to narrow down to a particular area.
>
> Regards,
> Alex.
> Personal blog: http://blog.outerthoughts.com/
> Research group: http://www.clt.mq.edu.au/Research/
>
>
>
> On Thu, Nov 20, 2008 at 1:28 AM, <sdb at cs.rmit.edu.au> wrote:
> > Dear colleagues,
> >
> > My research relates to authorship attribution of source code (that is,
> > determining the owner of anonymous work samples based upon other work
> > samples where authors are known).
> >
> > I'm looking for recommendations for source code corpora for this task
> > for any programming language. For the corpora to be useful, authorship
> > has to be identified.
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>

--
----------------------------------------------------------------------
Darren Pearce
----------------------------------------------------------------------
*** Shop & Donate: http://buy.at/campuskids ***
----------------------------------------------------------------------
darrenp at dcs.bbk.ac.uk
Postdoctoral Researcher
London Knowledge Lab, University of London
----------------------------------------------------------------------
darrenp at sussex.ac.uk
Visiting Research Fellow
Informatics, University of Sussex
http://www.informatics.sussex.ac.uk/users/darrenp/
----------------------------------------------------------------------
darren.pearce at gmail.com
http://www.linkedin.com/in/darrenpearce
----------------------------------------------------------------------


