[Corpora-List] Urdu-Hindi Transliteration corpora

Shahzad Khan shahzad at gnowit.com
Sat Mar 17 18:15:22 CET 2018

Hi Nick, Around 10 years ago, I worked with some students to develop a rule-based English-to-Urdu translation system (the input was in English and the output was Urdu phonetically transliterated into Latin script -- this was meant as input for a text-to-speech system to help illiterate people access Wikipedia etc.).

If this is useful for you, I can try to dig it up -- the paper and code are somewhere in my archives. It was implemented in Java.

Perhaps you could use this to seed and/or bootstrap some examples?

- Shahzad

On Sat, Mar 17, 2018 at 7:59 AM, Nick Ruiz <nruiz at interactions.com> wrote:

> Hi all,
> Can you help me identify any Urdu-Hindi parallel transliteration corpora
> that are available on the web? By transliteration, I mean strictly the
> conversion of writing systems, not translation. Thanks in advance!
> Kind regards,
> Nicholas Ruiz
> Interactions Labs