The Author Profiling task is concerned with predicting an author's demographics from her writing. Besides identifying the author personally, writing style may also reveal her age and gender. Accurate predictors are of key interest to forensic linguists and marketers alike. Participants will be provided with a training data set that consists of documents written in English and, for those interested, in Spanish. With regard to age, we will consider three classes: 10s (13-17), 20s (23-27), and 30s (33-47). Moreover, documents from authors who pretend to be minors will be included (e.g., documents composed of chat lines of sexual predators).
The training corpus is already available. More information on this task and the other two PAN tasks (Plagiarism Detection and Author Identification): http://pan.webis.de/ Information about the main CLEF-2013 conference: http://www.clef2013.org/