[Corpora-List] efficient decision tree tool?

Andy Roberts andyr at comp.leeds.ac.uk
Thu Jan 19 11:23:00 CET 2006

I expect Ross Quinlan's C4.5 will be adequate then (which is what J4.8
is based on).

You can get it from http://www.rulequest.com/Personal/


On Thu, 19 Jan 2006, Caren Brinckmann wrote:

> Dear all,
>
> we are currently working on corpus-based models of duration, F0, intensity,
> and segmental reductions in read and spontaneous speech. For the first part
> of our study we will use decision trees.
>
> Since our database is fairly large, I am looking for an efficient decision
> tree tool with the following features:
>
> * nominal and numeric input features and predictees (classification and
>   regression trees)
> * binary as well as multi-way splits
> * efficient handling of large datasets (200,000 cases/records/instances with
>   up to 100 attributes/features/variables)
> * nice to have: integrated feature selection algorithm
>
> In previous studies, I've worked with "wagon" from the Edinburgh Speech Tools
> Library (http://www.cstr.ed.ac.uk/projects/speech_tools/) and "J48" from Weka
> (http://www.cs.waikato.ac.nz/ml/weka/). While wagon is very fast and
> memory-efficient, it only allows binary splits (as far as I know). Weka
> allows multi-way splits, but is too slow and memory-consuming for our current
> datasets.
>
> I'm looking forward to your suggestions!
>
> Kind regards,
>
> Caren.
>
> P.S.: If you know any other mailing list or forum where I could post my
> question, please let me know.
>
> --
> Caren Brinckmann
> Saarland University, FR 4.7 Institute of Phonetics
> P.O.Box 151150, 66041 Saarbruecken, Germany
> Phone: +49-681-3024244, Fax: +49-681-3024684


