[Corpora-List] Bilingual dictionary request

Francis Bond bond at ieee.org
Mon Oct 1 02:53:20 CEST 2018


G'day,

You can easily make one from the Open Multilingual Wordnet (OMW) using NLTK.

I attach the code to do it, along with the resulting dictionary.
Note that the top of the dictionary file is the license and citation
information. Please remember to cite the Arabic WordNet, Princeton
WordNet, the OMW, and NLTK if you use this.

--- omw2bi.py ---
# make a bilingual dictionary from OMW
# Francis Bond (2018)
# Released into the public domain
#
# python3 omw2bi.py > arb_eng.tsv
#
from nltk.corpus import wordnet as wn

# list the languages available in the installed OMW data
print("we have", " ".join(wn.langs()))

lg2='eng'
lg1='arb'

# license and citation information goes at the top of the output
for lg in [lg1, lg2, 'omw']:
    print(wn.license(lg))
    print(wn.citation(lg))
    print('#' *72)

# one tab-separated line per pair of lemmas that share a synset
for ss in wn.all_synsets():
    for l1 in ss.lemma_names(lang=lg1):
        for l2 in ss.lemma_names(lang=lg2):
            print(l1.replace('_', ' '),
                  l2.replace('_', ' '),
                  sep='\t')
------
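
If you want to read the resulting file back into Python, something along these lines should work. This is only a rough sketch and not part of omw2bi.py: the function name load_bilingual_tsv is made up for illustration, and it assumes that every dictionary entry is exactly two non-empty tab-separated fields, so that the license and citation lines at the top can be recognised and skipped. If a header line happens to contain a single tab, it would be misread as an entry. Since a pair of lemmas may share more than one synset, the same pair can appear more than once, so the sketch also collapses duplicates.

```python
# Hypothetical helper, not part of omw2bi.py: read the generated TSV back in.
from collections import defaultdict


def load_bilingual_tsv(lines):
    """Map each source-language lemma to a sorted list of its translations.

    Assumption: a dictionary entry is exactly two non-empty fields separated
    by one tab. Anything else (license text, citations, the '#' rule, blank
    lines) is treated as header material and skipped.
    """
    entries = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2 or not all(field.strip() for field in fields):
            continue  # not an entry line
        source, target = (field.strip() for field in fields)
        entries[source].add(target)  # a set drops repeated pairs
    return {source: sorted(targets) for source, targets in entries.items()}
```

With the attached file it could be called as load_bilingual_tsv(open('arb_eng.tsv', encoding='utf-8')), which gives a plain dict from each Arabic lemma to its English equivalents.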

On Mon, Oct 1, 2018 at 2:14 AM safae berrichi <berrichi.safae at gmail.com> wrote:
>
> Dear Colleagues,
>
> I am a Ph.D. student in natural language processing and I am currently working on Statistical Machine Translation.
> I need a digital version of an Arabic-English bilingual dictionary.
> Can you tell me if there is a freely available version?
>
> Thank you in advance.
>
>
> --
>
> =========================================
> Safae BERRICHI
> PhD Candidate in Computer Science Laboratory
> Department of Mathematics and Computer Science
> Faculty of Science, Mohammed First University
> Oujda, Morocco.
> Tel: (+212)6 50 36 08 79
> =========================================
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> https://mailman.uib.no/listinfo/corpora

--
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University
-------------- next part --------------
A non-text attachment was scrubbed...
Name: arb_eng.tsv
Type: text/tab-separated-values
Size: 2520138 bytes
Desc: not available
URL: <https://mailman.uib.no/public/corpora/attachments/20181001/b1578034/attachment-0001.tsv>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: omw2bi.py
Type: text/x-python
Size: 558 bytes
Desc: not available
URL: <https://mailman.uib.no/public/corpora/attachments/20181001/b1578034/attachment-0001.py>


