[Corpora-List] visualization of terms in lines in kind of a Bayesian, navigational way ...

Albretch Mueller lbrtchx at gmail.com
Fri Jan 28 10:09:10 CET 2022

On 1/26/22, Emiel van Miltenburg <C.W.J.vanMiltenburg at tilburguniversity.edu> wrote:
> Concordances would also be my first thought.

I checked the concordance link, which seemed to be an advertising page.

> But given that the question was
> also about weight, I was imagining something like a Sankey/alluvial diagram
> where you take the top-k ngrams leading up to the relevant word, and draw
> weighted lines to that word with ngram frequency to determine the weight.

More like it.
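
As a rough sketch of the data such a Sankey/alluvial diagram would need (only the counting step, not the drawing), one could collect the n-token contexts that immediately precede the word of interest and keep the k most frequent, using the raw frequency as the line weight. The function name top_k_contexts, the whitespace tokenization and the toy sentence are assumptions made for illustration; nothing here comes from an existing tool.

```python
from collections import Counter

def top_k_contexts(tokens, target, n=2, k=3):
    """Return the k most frequent n-token contexts that immediately
    precede `target`, as (context_tuple, frequency) pairs.
    Hypothetical helper: the frequency stands in for the line weight."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        # Only count occurrences that have a full n-token left context.
        if tok == target and i >= n:
            counts[tuple(tokens[i - n:i])] += 1
    return counts.most_common(k)

# Toy corpus, naive whitespace tokenization (an assumption).
text = ("the cat sat on the mat and the dog sat on the rug "
        "while the cat sat on the mat")
edges = top_k_contexts(text.split(), "sat", n=2, k=3)
for context, weight in edges:
    print(" ".join(context), "->", "sat", "| weight:", weight)
```

Each (context, weight) pair would then become one weighted line leading into the target word in whatever plotting library is used.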

> Thinking about this some more, maybe there's some relevant work on
> visualizing markov chains?

Visualizing inhomogeneous Markov chains of the characters in the text segments of a continuously upgradable corpus, in such a way that one could navigate from every single character to any text segment.
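
To make that a bit more concrete, here is a minimal sketch that reads "inhomogeneous" as "the transition probabilities depend on the character's position within the segment". That reading, the function names positional_transitions and transition_probs, and the toy segments are all assumptions for illustration. The counts are kept as plain tallies, so new segments can be added as the corpus grows.

```python
from collections import Counter, defaultdict

def positional_transitions(segments, table=None):
    """Tally character transitions per position: table[(t, c)][d] is how
    often character c at position t is followed by d at position t + 1.
    Pass an existing table to update it with new segments."""
    if table is None:
        table = defaultdict(Counter)
    for seg in segments:
        for t in range(len(seg) - 1):
            table[(t, seg[t])][seg[t + 1]] += 1
    return table

def transition_probs(table, t, ch):
    """Normalize the tallies for (position t, character ch) into
    probabilities; returns an empty dict for an unseen state."""
    counts = table.get((t, ch), Counter())
    total = sum(counts.values())
    return {nxt: c / total for nxt, c in counts.items()} if total else {}

# Toy segments standing in for corpus text (an assumption).
table = positional_transitions(["abc", "abd", "acd"])
print(transition_probs(table, 0, "a"))
print(transition_probs(table, 1, "b"))

# The corpus can be extended later without recounting from scratch.
positional_transitions(["abe"], table)
```

A navigable view would additionally need an index from each (position, character) state back to the segments it occurs in, which this sketch leaves out.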

On 1/27/22, Steve Jeaco <steve.jeaco at xjtlu.edu.cn> wrote:
> ... The Prime Machine
> has a concordance line sorting algorithm which sorts lines according to
> collocations and items repeated in specific slots... the collocation measure
> uses Bayesian Information Criterion. It is described in IJCL 26(2)
> https://doi.org/10.1075/ijcl.18056.jea

Unfortunately, I don't have access to that paper. From a Google search on "The Prime Machine", it seems that most of the discussion about it relates to educational settings; is that right?

On 1/27/22, Mike Scott <mike at lexically.net> wrote:
> Your query reminds me of Kintsch & Van Dijk's article nearly fifty years
> ago:
> Kintsch, Walter & Teun van Dijk, 1978, Toward a Model of Text
> Comprehension and Production. Psychological Review, Vol 85, No. 5. pp.
> 363-394

discourses.org/OldArticles/Towards a model.pdf

I found that paper, about (the comprehension of algorithmic) text summarization and the associated physical constraints in the 1970s, very interesting, even if it is somewhat philosophical and not explicitly or exhaustively explained; or perhaps I was reading more into the model than the authors were describing.

Thank you,

