[Corpora-List] Corpus Development

Bob Parks bobp at clarityconnect.com
Mon Apr 28 16:22:05 CEST 2008


Serge and Mark,

I'd be interested in hearing more about your views on the differences, pro and con, between relational DB architectures and full-text DB architectures. Both must also take into account the characteristics of different programming languages. And both approaches tie into different strategies for interacting with users, from creation of data/content to query strategies. Any thoughts?

Thanks,
Bob


>Mark,
>
>On Sunday, April 27, 2008 6:44 PM [GMT+1=CET],
>Mark Davies <Mark_Davies at byu.edu> wrote:
>> Most really large corpora that I'm aware of do use a relational
>> database architecture, including systems like IMS Corpus Workbench.
>
>The architecture of the IMS Corpus Workbench software is based on
>specific indexing techniques for processing and querying textual data.
>Those techniques were described in the book:
>"Managing Gigabytes: Compressing and Indexing Documents and Images"
>by Ian H. Witten, Alistair Moffat, Timothy C. Bell, 1999, Morgan Kaufmann.
>No RDBMS or similar architecture was used, as can be seen
>from the source: http://cwb.sourceforge.net/
>
>Best,
>Serge
>
>_____________________________________________________________
>Serge Heiden, slh at ens-lsh.fr, https://weblex.ens-lsh.fr
>ENS-LSH/CNRS - ICAR UMR5191, Institut de Linguistique Française
>15, parvis René Descartes 69342 Lyon BP7000 Cedex, tél. +33(0)622003883
>
>
>_______________________________________________
>Corpora mailing list
>Corpora at uib.no
>http://mailman.uib.no/listinfo/corpora

--
* The best dictionary and integrated thesaurus on the web: http://www.wordsmyth.net
* Robert Parks - Wordsmyth - (607) 272-2190
* "To imagine a language is to imagine a form of life." (LW)
* "Philosophers have only interpreted the world. The point, however, is to change it." (KM)
* Community grows as we communicate, honing our words till their meanings tap the rich voice of our full human potential.


