Some historical points:
1. Statistical methods for content analysis were pioneered by
Lasswell (1948) and Berelson (1952), and they were computerized
as soon as computers became widely available. For references,
2. Charles C. Fries pioneered the use of corpora in language
analysis from the 1920s to the 1950s. For references, see
3. As early as 1947, Warren Weaver recognized the potential
for computers in machine translation, and he was instrumental
in getting funding for it. He also coauthored, with Claude
Shannon, _The Mathematical Theory of Communication_
(1949). That book stimulated a considerable body of research
in the application of statistical methods to language analysis.
4. Chomsky's thesis adviser, Zellig Harris, pioneered transformational
methods. Unlike Chomsky, Harris emphasized the use of corpora and
statistics. See the collection, _The Legacy of Zellig Harris_:
5. Victor Yngve, a pioneer in MT, was also a pioneer in using
statistics in language analysis. Hutchins summarizes both
6. As the director of the MT project at MIT, Yngve hired Chomsky as
a promising young PhD whose syntactic methods might be useful.
Chomsky also taught a course in linguistics and published his
notes as _Syntactic Structures_ (1957). In that book, Chomsky
strongly rejected statistical methods and the use of corpora.
7. In the 1980s, Fred Jelinek used statistical methods for a project
on speech recognition at IBM Research. John Cocke suggested that
similar methods might be useful for MT. In those days, such
methods swamped the capacity of the largest IBM mainframes. By the
1990s, they could run on minicomputers and workstations.