Miss Elliott? Argh, I remember her!
All seriousness aside, I think we all know what's going on here. Irregardless (ouch, I'm sorry, Miss Elliott, that's "regardless"!) of whether we think of this as a problem for humans (expert or otherwise) or for computers, there's a clustering problem. We have a bunch of uses of some word in texts. Some expert(s) have eyeballed a subset of these tokens-in-text and decided to make N clusters, which are the N senses. Later on someone else comes along and tries to cluster a new set of tokens-in-text using these same N clusters (and still later we get a computer to try to replicate those clusters-of-tokens).
But who's to say what N should be? It is well known that if you ask M experts to create senses based on the same corpus, then for any moderately polysemous word you'll get at least M different Ns. (Look at any two dictionaries.) So asking those M experts to tag senses according to the N senses chosen by one of them is asking people to perform a task that they don't really believe in. If you got those experts in a room, you'd have http://www.arthermitage.org/Adriaen-van-Ostade/Brawl.html.
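To put a number on that kind of disagreement, here's a rough sketch of my own (not anything from the sense-tagging literature specifically): take two experts who sorted the same tokens into different numbers of senses and compute a plain Rand index, i.e. the fraction of token pairs on which they agree about "same sense" vs. "different sense". The function name `rand_index`, the word "bank", and the sense labels are all made up for illustration.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of token pairs on which two clusterings agree.

    A pair counts as agreement when both experts put the two tokens in the
    same sense, or both put them in different senses. The label names
    themselves don't matter, so experts with different N can be compared.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both experts must label the same tokens")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two tokens")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical data: six uses of "bank", tagged by two imaginary experts.
expert_a = ["money", "money", "river", "river", "money", "river"]  # N = 2
expert_b = ["institution", "building", "river",
            "river", "institution", "river"]                       # N = 3

print(rand_index(expert_a, expert_b))
```

Expert B splits A's "money" sense in two, so they disagree only on the pairs that straddle that split. A lumper and a splitter can look fairly close by this measure even though neither would sign off on the other's N.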
Granted, when experts tag word tokens for sense as input to a machine learning algorithm, they probably aren't the experts who created the N sense clusters in the first place. But I can still hear them saying, "What idiot broke things up this way? Can you say 'gerrymandering'?"
(And no, I did not like the way Miss Elliott diagrammed sentences. Unfortunately, when I did my PhD, I found out she was right.)