[Corpora-List] automatic search for orthographic recurring patterns
argamon at iit.edu
Wed Dec 8 18:00:00 CET 2004
See our paper in COLING-04:
Shlomo Argamon, Navot Akiva, Amihood Amir, and Oren Kapah.
Efficient Unsupervised Recursive Word Segmentation Using Minimum
Proceedings of The 20th International Conference on Computational
Linguistics (COLING), August 2004.
Available at http://lingcog.iit.edu/pub.xml
MARC FRYD wrote:
> Perhaps someone on the List will be able to help me with the following
> datamining problem:
> Given a corpus of isolated lexical units or collocations, I would like
> to determine recurring orthographic patterns, whether initial, e.g.
> "CARPO" (carpogenic, carpogenous, carpolite), final, e.g. "IONALISM"
> (sensationalism, functionalism, etc.), or internal, e.g. "CHRON"
> (synchrony, synchronize, etc.).
> The output should be arranged so as to show respective productivity for
> each pattern.
> Important constraint: the various patterns will *not* be fed in
> initially but should be extracted as a result of the algorithm.
> I'll post a summary if I get several replies.
> Regards to all list members.
> Marc Fryd
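To make the problem statement above concrete, here is a naive baseline sketch in Python. It is not the segmentation method of the COLING paper cited above. It simply counts, for every substring, how many distinct words contain it word-initially, word-finally, or word-internally, and ranks the patterns by that count as a rough stand-in for productivity. The function name recurring_patterns and the parameters min_len and min_words are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def recurring_patterns(words, min_len=4, min_words=2):
    """Rank substrings by the number of distinct words that contain them.

    Returns (position, pattern, word_count) tuples, where position is
    "initial", "final", or "internal". No patterns are supplied in
    advance; they are all read off the word list itself.
    Assumption: "productivity" is approximated by distinct-word count.
    """
    seen = defaultdict(set)  # (position, pattern) -> words containing it
    for word in {w.lower() for w in words}:
        n = len(word)
        for i in range(n):
            for j in range(i + min_len, n + 1):
                if i == 0 and j == n:
                    continue  # the whole word is not a sub-pattern
                if i == 0:
                    position = "initial"
                elif j == n:
                    position = "final"
                else:
                    position = "internal"
                seen[(position, word[i:j])].add(word)

    rows = [
        (position, pattern, len(ws))
        for (position, pattern), ws in seen.items()
        if len(ws) >= min_words
    ]
    # Most productive first; among ties, prefer the longer pattern.
    rows.sort(key=lambda r: (-r[2], -len(r[1]), r[1], r[0]))
    return rows


if __name__ == "__main__":
    corpus = [
        "carpogenic", "carpogenous", "carpolite",
        "sensationalism", "functionalism",
        "synchrony", "synchronize",
    ]
    for position, pattern, count in recurring_patterns(corpus)[:10]:
        print(f"{position:8} {pattern.upper():12} {count}")
```

This enumerates every substring, so the work is quadratic in word length and the output includes many overlapping patterns (for example both CARP and CARPO). It is workable for a word list, but a real solution would need some criterion for choosing among overlapping candidates.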