[Corpora-List] An ignorant question concerning the basics of statistical significance > REPRESENTATIVENESS

Mcenery, Tony a.mcenery at lancaster.ac.uk
Tue Feb 3 22:04:45 CET 2015

Angus - I one hundred percent agree. Cherry picking data and denying the structure in what we observe is not simply silly, it also drains explanatory and descriptive power from research. In that spirit, approximating, in a simulacrum, what we want to observe is an essential step not just for linguists, but for social scientists in general, imho. But we must also be mindful that in all modelling we are dealing with approximation (and usually idealization). That was the spirit of my earlier comment, I guess. Anyway - keep battling on, I think you are on the right lines!

________________________________
From: corpora-bounces at uib.no [corpora-bounces at uib.no] on behalf of Angus Grieve-Smith [grvsmth at panix.com]
Sent: 03 February 2015 20:06
To: corpora at uib.no
Subject: Re: [Corpora-List] An ignorant question concerning the basics of statistical significance > REPRESENTATIVENESS

Thanks for your response, Tony. Sorry, but some handwringing is definitely due here.

I've actually been fighting with sociologists and epidemiologists over their notions of representativeness in an area that affects me personally (transgender studies). These social scientists practice almost no representative sampling, but rather than stick to qualitative, existential observations, they simply collect quantitative data ad hoc and pretend that it can be generalized to entire populations (with a disclaimer that is universally ignored). This has clear and demonstrable negative effects.

I think that kind of hand-waving is a horrible example to emulate, and I want no part of it. However, I think Chris Brew made a good point, and I will address it in a separate message.

On 2/3/2015 11:47 AM, Mcenery, Tony wrote:

Hi Angus,

I believe Lou pretty much has it in one. I have had some interesting discussions about representativeness with social scientists. Our approach to representativeness is by necessity more fluid and impressionistic than that used in some social sciences, though it is also similar to that used by other social scientists. To reach perfect representativeness we need, as Ramesh suggests, a good model of what we are representing. However, we have that for language no more than panel surveys (e.g. the UK household survey) have that for the whole of the UK. So we select factors and we work towards making sure that you can study those with that data. So the type of statement Lou made is important - it is useful to know what corpus builders intended you to be able to study using their data, in just the same way as it is important to know what a panel survey intended you to be able to study using it. So I would still appeal to representativeness as a notion and as an ideal. But I accept that, when operationalised, corpora (with rare exceptions) approximate to, rather than achieve, this ideal. But this is far from unusual in the social sciences, so we need not hand-wring unduly. Anyway, those are my thoughts on the matter, Angus, as you asked for them.


-Angus B. Grieve-Smith

grvsmth at panix.com
