[Corpora-List] Source code corpora

Klaus Guenther klaus.guenther at split.uni-bamberg.de
Thu Nov 20 20:29:56 CET 2008

Many modern projects provide automatically generated API documentation. This would seem to be nearly the reverse of the mapping you wish to study. Tools that do this include phpDocumentor, which builds the documentation from blocks of comments within the code (phpdoc, javadoc, etc.).
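
To illustrate the kind of comment-to-code pairing such documentation blocks make available, here is a minimal sketch. It is written in Python rather than PHP or Java and uses only the standard `ast` module. The function name `comment_code_pairs` and the sample source are hypothetical, chosen for illustration, and not taken from any of the tools mentioned.

```python
import ast


def comment_code_pairs(source: str):
    """Return (name, description, code) triples for each documented function.

    Assumption: the English description lives in the function's docstring,
    which plays the role a phpdoc/javadoc comment block plays elsewhere.
    """
    tree = ast.parse(source)
    pairs = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            if doc:  # skip functions without a documentation block
                code = ast.get_source_segment(source, node)
                pairs.append((node.name, doc, code))
    return pairs


# Hypothetical sample: one documented function, one undocumented.
SAMPLE = '''
def area(width, height):
    """Return the area of a rectangle."""
    return width * height

def helper(x):
    return x + 1
'''

pairs = comment_code_pairs(SAMPLE)
for name, description, code in pairs:
    print(name, "::", description)
```

Only the documented function yields a pair. The descriptions obtained this way document the code after the fact, so they are not the same thing as requirements specifications written beforehand.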

If you are particularly interested in system requirements, I am not quite sure how you could map these to the code itself, unless compiler optimisation is taken into account. Even so, that is mainly automated.

Regards, Klaus

-- 
Klaus Guenther
English Linguistics
University of Bamberg

Eric Atwell wrote:
> I also seek source code corpora, but with English-language requirements
> specifications accompanying each program; for a PhD project on mapping
> from English specification to formalism and/or code. Any pointers welcome.
> Eric Atwell, School of Computing, University of Leeds
> On Thu, 20 Nov 2008, sdb at cs.rmit.edu.au wrote:
>> Dear colleagues,
>> My research relates to authorship attribution of source code (that is,
>> determining the owner of anonymous work samples based upon other work
>> samples where authors are known).
>> I'm looking for recommendations for source code corpora for this task
>> for any programming language. For the corpora to be useful, authorship
>> has to be identified.
>> My work to date has involved student programming assignments, and I'm
>> now interested in other sources such as industry and open-source
>> projects.
>> Many thanks,
>> ---------------------------------------------------
>> Steven Burrows
>> PhD Candidate, Sessional Lecturer
>> School of Computer Science & Information Technology
>> RMIT University
>> GPO Box 2476V, Melbourne VIC 3001, Australia
>> o: 14.09.04
>> p: +(61 3) 9925 2758
>> f: +(61 3) 9662 1617
>> e: steven.burrows at rmit.edu.au
>> w: www.cs.rmit.edu.au/~sdb
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
