Here's a summary of the pointers we got for the number and date normalization inquiry:
- ICU4J (http://icu-project.org/index.html) - a set of libraries for globalization purposes, including number and date formatting.
- A date normalizer by Mark Greenwood found at: http://www.dcs.shef.ac.uk/~mark/dev/java/index.html
- Unix date program, part of GNU coreutils: http://www.gnu.org/software/coreutils/
- hCalendar (http://microformats.org/wiki/hcalendar), a microformats standard for calendar and event data.
- TempEx: for date and time expression tagging by George Wilson: http://timex2.mitre.org/cgi-bin/download?file=TempEx_R1_05_03.tar
Thanks to Trevor Jenkins, Michael Hawkes, Mark Greenwood and George Wilson for their help.
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Shachar Mirkin
Sent: Thursday, July 24, 2008 8:25 PM
To: corpora at uib.no
Subject: [Corpora-List] number and dates normalization
I'm looking for an available package (preferably Java) for number and date normalization that, given "fifteen hundred", will return "1500" and, given "January, 23 1987", will return a date in some predefined format, e.g. "23/1/87".
Does anyone know of such a tool?
Bar-Ilan University, Israel
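
To make the two examples in the query concrete, here is a minimal Java sketch of the kind of normalization being asked about. It is not taken from any of the packages listed above: the class name NormalizerSketch and the methods parseNumber and normalizeDate are hypothetical, and it relies only on the standard java.time and java.util classes. It assumes English input, number words up to the millions, and exactly the "Month, day year" layout shown in the query; a real tool would need to handle far more variation.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class for illustration only.
final class NormalizerSketch {
    // Words that add their value to the running group (0-19 and the tens).
    private static final Map<String, Long> SMALL = new HashMap<>();
    // Words that close a group by multiplying it by a scale.
    private static final Map<String, Long> SCALES = new HashMap<>();

    static {
        String[] units = {"zero", "one", "two", "three", "four", "five", "six",
            "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen",
            "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
        for (int i = 0; i < units.length; i++) {
            SMALL.put(units[i], (long) i);
        }
        String[] tens = {"twenty", "thirty", "forty", "fifty", "sixty",
            "seventy", "eighty", "ninety"};
        for (int i = 0; i < tens.length; i++) {
            SMALL.put(tens[i], (long) (i + 2) * 10);
        }
        SCALES.put("thousand", 1_000L);
        SCALES.put("million", 1_000_000L);
    }

    // Assumed input layout: full English month name, comma, day, four-digit year.
    private static final DateTimeFormatter IN =
        DateTimeFormatter.ofPattern("MMMM, d yyyy", Locale.ENGLISH);
    // Output schema from the query: day/month/two-digit year, no zero padding.
    private static final DateTimeFormatter OUT =
        DateTimeFormatter.ofPattern("d/M/yy", Locale.ENGLISH);

    // Converts English number words to a long, e.g. "fifteen hundred".
    static long parseNumber(String text) {
        long total = 0;   // sum of groups already closed by "thousand"/"million"
        long current = 0; // value of the group being read
        for (String word : text.toLowerCase(Locale.ENGLISH).trim().split("[\\s-]+")) {
            if (word.equals("and")) {
                continue; // "one hundred and five"
            }
            if (SMALL.containsKey(word)) {
                current += SMALL.get(word);
            } else if (word.equals("hundred")) {
                // A bare "hundred" is read as "one hundred".
                current = (current == 0 ? 1 : current) * 100;
            } else if (SCALES.containsKey(word)) {
                total += (current == 0 ? 1 : current) * SCALES.get(word);
                current = 0;
            } else {
                throw new IllegalArgumentException("Unrecognized number word: " + word);
            }
        }
        return total + current;
    }

    // Re-renders a date in the assumed input layout using the d/M/yy schema.
    static String normalizeDate(String text) {
        return LocalDate.parse(text.trim(), IN).format(OUT);
    }
}
```

With these definitions, NormalizerSketch.parseNumber("fifteen hundred") returns 1500 and NormalizerSketch.normalizeDate("January, 23 1987") returns "23/1/87". Input in any other date layout makes LocalDate.parse throw a DateTimeParseException, which is one reason a dedicated package is preferable to a hand-rolled pattern.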