[Corpora-List] Need POST system for French and English

Michele Filannino michele.filannino at cs.manchester.ac.uk
Wed May 9 11:57:36 CEST 2012


If you want to solve that kind of problem, you could easily write a spelling corrector using a language model that considers subparts of each word. The pattern "you -> u" will emerge. Alternatively, if you have a constrained vocabulary, you could use the Damerau-Levenshtein distance <http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance> between words.
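
To illustrate the constrained-vocabulary idea, here is a minimal sketch in Python. It computes the optimal string alignment distance, which is the restricted variant of Damerau-Levenshtein (adjacent transpositions count as one edit, but a substring is never edited twice), and picks the closest vocabulary entry. The function names and the tiny vocabulary are hypothetical, chosen only for this example; a real system would need a proper lexicon and a sensible distance threshold.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = distance between the first i chars of a and the first j chars of b
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # adjacent transposition, e.g. "ca" -> "ac"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def closest_word(word: str, vocabulary: list[str]) -> str:
    """Return the vocabulary entry with the smallest distance to `word`."""
    return min(vocabulary, key=lambda v: osa_distance(word, v))


# Hypothetical toy vocabulary, for illustration only.
vocabulary = ["computer", "commuter", "composer"]
print(closest_word("compyouter", vocabulary))
```

With this toy vocabulary, "compyouter" is two deletions away from "computer" and further from the other entries, so "computer" is selected. The corrected word could then be passed on to the POS tagger.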

Bye, Michele Filannino.

CDT PhD student in Computer Science
Room IT301 - IT Building
The University of Manchester
filannim at cs.manchester.ac.uk

On Wed, May 9, 2012 at 9:04 AM, Renaud Richardet <renaud.richardet at epfl.ch> wrote:


> Dear Imad,
>
> You can ask Nicolas Hernandez (see
> http://www.mail-archive.com/opennlp-users@incubator.apache.org/msg00564.html)
> for POS taggers in French.
>
> Regarding "compyouter", that might be more difficult to map…
>
> All the best, Renaud
>
>
> --
> Renaud Richardet
> Blue Brain Project PhD candidate
> EPFL Station 15
> CH-1015 Lausanne
>
>
> On Wed, May 9, 2012 at 4:35 AM, imad eddin Jerbi <
> jerbi.imad.eddin at gmail.com> wrote:
>
>> *Dear Corpora Subscribers,*
>>
>> My name is Imad Eddin Jerbi, and I am doing my master's thesis at the
>> Faculty of Economics and Management of Sfax, Tunisia.
>> I am working on the construction and morphosyntactic annotation of a
>> Tunisian dialect corpus.
>> I need a free and open-source (Java) part-of-speech tagging system for
>> French and English.
>> This system has to perform a linguistic correction first, because the
>> input could be an incorrect word.
>> *Example:*
>> Arabic Dialect: “كَمْبْيُوتَرْ”. This word is originally English; I
>> converted it to Latin characters using SAMPA for Arabic: “compyouter”.
>> So the system has to correct the input word “compyouter” to “computer”,
>> and then output the possible morphosyntactic annotations.
>> I would be very grateful if you could give me a list of the best
>> available systems.
>> Thank you in advance.
>> Email: jerbi.imad.eddin at gmail.com
>>
>> *Best regards, *
>>
>> --
>>
>> Imad Eddin JERBI
>>
>> Student at Faculty of Economics and Management of Sfax
>>
>> http://www.fsegs.rnu.tn/
>>
>>
>> ANLP Research Group
>> http://sites.google.com/site/anlprg
>>
>> MIRACL Laboratory
>> www.miracl.rnu.tn
>>
>>
>> Page Web: https://sites.google.com/site/jerbiimadeddinanlp/
>> Email: jerbi.imad.eddin at gmail.com
>> Address: El Wahheb, Chebba : 5170 - Mahdia - TUNISIE.
>> Gsm: +216 55688555
>>
>>
>>
>> _______________________________________________
>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>>
>>
>
>
>