[Corpora-List] Survey: applications using grammar-based parsers

Mike Maxwell maxwell
Wed Apr 1 00:46:17 CEST 2009

Michael Piotrowski wrote:
> Unfortunately, it seems that when you do not just want the performance
> numbers (cited in papers) but the actual, working system, it frequently
> turns out that it is not available (dead project, results locked away,
> commercial, etc.) or unusable in an application (too slow, not
> embeddable, etc.). At least our experience wrt. morphologic analysis
> and generation for German has been quite sobering.

Michael (the poster cited above, not me :-)) brings up an important point--or at least it's one I've been harping on for several years now.

A lot of work goes into a parser (the ones we've looked at have all been morphological parsers), and five or ten years later the parser is no longer available, or if it is available, it won't run because the software it runs on has been changed from underneath it. (I once worked on a project where two of the three programming language implementations we used were obsolete or defunct before the project even finished.)

There's probably not much that can be done about the commercial case that Michael mentions. But something can be done about the 'dead project' and 'results locked away' issues, as well as the software obsolescence issue. Open sourcing is part of the solution, but IMO only part.

Anyway, I would appreciate hearing stories about this kind of problem--or pointers to papers that mention the problem. Write to me and I can summarize, or if it's of general interest, post to the list.

--
Michael Maxwell

What good is a universe without somebody around to look at it?

--Robert Dicke, Princeton physicist
