[Corpora-List] number and dates normalization

Shachar Mirkin mirkins at macs.biu.ac.il
Tue Jul 29 17:17:06 CEST 2008


Hi,

Here's a summary of the pointers we got for the number and date normalization inquiry:

- ICU4J (http://icu-project.org/index.html) - a set of libraries for globalization purposes, including number and date formatting.

- A date normalizer by Mark Greenwood found at: http://www.dcs.shef.ac.uk/~mark/dev/java/index.html

- Unix date program, part of GNU coreutils: http://www.gnu.org/software/coreutils/

- hCalendar (http://microformats.org/wiki/hcalendar), a microformats standard for calendaring and events.

- TempEx: for date and time expression tagging by George Wilson: http://timex2.mitre.org/cgi-bin/download?file=TempEx_R1_05_03.tar

Thanks to Trevor Jenkins, Michael Hawkes, Mark Greenwood and George Wilson for their help.

Shachar

_____

From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Shachar Mirkin
Sent: Thursday, July 24, 2008 8:25 PM
To: corpora at uib.no
Subject: [Corpora-List] number and dates normalization

Hi,

I'm looking for an available package (preferably Java) for number and date normalization that, given "fifteen hundred", will return "1500" and, given "January, 23 1987", will return a date in some predefined format, e.g. "23/1/87".

Does anyone know of such a tool?
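
To make the two examples above concrete, here is a minimal Java sketch of the requested behavior. It is not taken from any of the tools mentioned in this thread. The class and method names (NormalizeSketch, normalizeNumber, normalizeDate) are hypothetical, it assumes English input only, it assumes the date always arrives in the "Month, day year" layout of the example, and it relies on the java.time API of current Java versions.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only: a tiny normalizer for English number words
// and for dates written like "January, 23 1987".
class NormalizeSketch {
    private static final Map<String, Integer> SMALL = new HashMap<>();
    private static final Map<String, Integer> SCALES = new HashMap<>();

    static {
        String[] units = {"zero", "one", "two", "three", "four", "five", "six",
            "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen",
            "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
        for (int i = 0; i < units.length; i++) {
            SMALL.put(units[i], i);
        }
        String[] tens = {"twenty", "thirty", "forty", "fifty", "sixty",
            "seventy", "eighty", "ninety"};
        for (int i = 0; i < tens.length; i++) {
            SMALL.put(tens[i], (i + 2) * 10);
        }
        SCALES.put("thousand", 1_000);
        SCALES.put("million", 1_000_000);
    }

    // Converts English number words to a value, e.g. "fifteen hundred" to 1500.
    static long normalizeNumber(String text) {
        long total = 0;   // sum of completed thousand/million groups
        long current = 0; // value of the group being read
        String[] tokens = text.trim().toLowerCase(Locale.ENGLISH).split("[\\s-]+");
        for (String token : tokens) {
            if (token.equals("and")) {
                continue; // "one hundred and five"
            }
            if (SMALL.containsKey(token)) {
                current += SMALL.get(token);
            } else if (token.equals("hundred")) {
                // A bare "hundred" counts as one hundred.
                current = Math.max(current, 1) * 100;
            } else if (SCALES.containsKey(token)) {
                total += Math.max(current, 1) * SCALES.get(token);
                current = 0;
            } else {
                throw new IllegalArgumentException("Unknown number word: " + token);
            }
        }
        return total + current;
    }

    // Assumed input layout "MMMM, d yyyy"; output layout day/month/two-digit year.
    private static final DateTimeFormatter INPUT =
        DateTimeFormatter.ofPattern("MMMM, d yyyy", Locale.ENGLISH);
    private static final DateTimeFormatter OUTPUT =
        DateTimeFormatter.ofPattern("d/M/yy", Locale.ENGLISH);

    static String normalizeDate(String text) {
        return LocalDate.parse(text.trim(), INPUT).format(OUTPUT);
    }

    public static void main(String[] args) {
        System.out.println(normalizeNumber("fifteen hundred"));   // 1500
        System.out.println(normalizeDate("January, 23 1987"));    // 23/1/87
    }
}
```

A real tool would need to cope with many more surface forms (ordinals, fractions, other date layouts, other languages), which is why an existing package is preferable to a hand-written parser like this one.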

Thanks,

Shachar Mirkin

Bar-Ilan University, Israel
