[Corpora-List] Clean Enron Anyone?
peet.morris at comlab.ox.ac.uk
Fri Mar 18 18:20:00 CET 2005
I'm wondering whether anyone has a 'cleaned' version of the Enron email
corpus?
In its raw state, most of the emails contain routing headers, footers,
disclaimers, etc. - plus, IMHO, some of the emails are spam.
If no one has a cleaned-up version, I am going to attempt the cleanup
myself - so, if anyone's interested in getting the output of that effort,
please let me know.
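
For anyone curious what such a cleanup might involve, here is a minimal
sketch in Python using only the standard library. It is not an actual
cleanup script for the corpus: the KEEP_HEADERS list, the FOOTER_PATTERN
regular expression, and the clean_message function are all hypothetical
names chosen for illustration, and real disclaimers vary far more than
this pattern allows. Spam filtering is not attempted.

```python
import email
import re
from email import policy

# Assumption: only these headers are worth keeping; all others
# (Message-ID, X-Folder, routing lines, ...) are dropped.
KEEP_HEADERS = ("From", "To", "Subject", "Date")

# Assumption: a line starting like this marks the beginning of a
# trailing disclaimer or footer. Real data needs a longer list.
FOOTER_PATTERN = re.compile(
    r"^(this e-?mail .*confidential|\*{10,})", re.IGNORECASE
)


def clean_message(raw: str) -> dict:
    """Return the kept headers and the body text up to the first footer line."""
    msg = email.message_from_string(raw, policy=policy.default)

    # Keep only the whitelisted headers that are actually present.
    headers = {
        name: str(msg[name]) for name in KEEP_HEADERS if msg[name] is not None
    }

    # Prefer the plain-text part; fall back to an empty body.
    part = msg.get_body(preferencelist=("plain",))
    text = part.get_content() if part is not None else ""

    # Copy body lines until a footer or disclaimer marker is seen.
    kept = []
    for line in text.splitlines():
        if FOOTER_PATTERN.match(line.strip()):
            break
        kept.append(line)

    return {"headers": headers, "body": "\n".join(kept).strip()}
```

Calling clean_message on the raw text of one message returns a dict with
"headers" and "body" keys. Cutting at the first matching line is a blunt
choice: it discards everything after the marker, so the pattern should be
kept conservative to avoid truncating genuine content.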
Have a nice weekend,