[Corpora-List] Corpus for transliteration of names

Martin Reynaert reynaert at uvt.nl
Thu Apr 28 00:23:35 CEST 2016

Dear Grishma,

I happen to have been working on a name transliteration system myself for the past year or so, covering both person and place names.

Great resources are JRC-Names for person names and GeoNames for place names.

Your query does not specify what kind of names you work on. You tell us very little about the actual languages you have in mind. And personally I am at a loss when you mention temporal features. Would that be biographical data about people whose names you would want to transliterate, or might that be e.g. older, disused names for cities somewhere?

Ranking about 1K candidates for any name seems to me like a tough nut to crack. Perhaps you should cast your net less wide?



On 27/04/16 18:20, Grishma Jena wrote:
> Hello,
> I'm working on transliteration of names for low-resource languages.
> I'm trying to incorporate more features that would help in creating a
> reranker for the candidate list. Basically, I have a list of 1000
> candidates for each name and I'm trying to find which is the correct
> transliteration. Is there any dataset available from which I can get
> some features for the reranker, especially temporal features?
> Thank you.
> --
> Regards,
> Grishma Jena
> MSE Computer and Information Sciences
