[Corpora-List] Grep for Windows

D.W.Hardcastle D.W.Hardcastle at open.ac.uk
Fri Dec 15 19:43:01 CET 2006


It's not a grep tool as such, but Windows PowerShell (which feels a
little like scripting with .NET) handles scripted and command-line
processing of text files better than DOS commands do, although I doubt
it matches Perl or Python for performance.
http://msdn2.microsoft.com/en-us/library/ms714674.aspx
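
For comparison, a grep-like search in Python might look roughly like
the sketch below. It is only an illustration: the function name
grep_lines and the UTF-8 default encoding are assumptions made for the
example, not part of any existing tool, and I have not measured how it
performs on a 100-million-word corpus. It reads the file one line at a
time, so the whole corpus never has to be held in memory.

```python
import re
from typing import Iterator, Tuple


def grep_lines(path: str, pattern: str,
               encoding: str = "utf-8") -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for every line that matches pattern.

    grep_lines is a hypothetical helper written for this example.
    """
    regex = re.compile(pattern)  # compile once, reuse for every line
    # Iterating over the file object reads one line at a time.
    # Assumption: the corpus is plain text in the given encoding;
    # undecodable bytes are replaced rather than raising an error.
    with open(path, encoding=encoding, errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                yield number, line.rstrip("\n")


if __name__ == "__main__":
    import sys

    # Usage: python grep_sketch.py PATTERN FILE
    for number, line in grep_lines(sys.argv[2], sys.argv[1]):
        print(f"{number}:{line}")
```

Saved under a name such as grep_sketch.py (also just an example name),
it would be run from the Windows command prompt with the pattern first
and the corpus file second.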

There is some information on regex and PowerShell here. I'm not sure
what the coverage is like; being Microsoft, perhaps they have invented
their own syntax!
http://www.microsoft.com/technet/scriptcenter/topics/msh/payette1.mspx


Regards,

Dave



--
David Hardcastle
Research Programmer, Natural Language Generation Group
Faculty of Mathematics and Computing, room 121, North Spur
The Open University, Walton Hall, Milton Keynes, MK7 6AA
+44 (0) 1908 659947

-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of Mark Davies
Sent: 15 December 2006 15:00
To: corpora at hd.uib.no
Subject: [Corpora-List] Grep for Windows

This next semester, I'd like to have the students in my Corpus
Linguistics class learn to use Grep tools for searching large corpora. I
know there are many great, fast Unix tools, but these students will be
using Windows machines. If possible, the program would have the
following features:

-- Fast, since they'll be working with fairly large corpora (100 million
words and more)
-- Obviously, full regular-expression capability
-- Not run under Cygwin or a similar program, but rather as a native
Windows app

I've already looked at PowerGrep, V-Grep, and TextPad, but none of these
are adequate. Any other suggestions? Thanks in advance.

Mark Davies

============================================
Mark Davies
Professor of (Corpus) Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906
Web: davies-linguistics.byu.edu

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================





