[Corpora-List] Is Language Identification Really Solved?

Erin McKean erin at logocracy.com
Mon Jun 29 23:10:22 CEST 2015


I don't like to blindly click on shortened links either, so when I need to open one I often expand it first with a site such as http://urlxray.com/

Hope this helps!

Erin

On 6/28/15 8:39 PM, maxwell wrote:
> On 6/26/2015 4:09 AM, liling tan wrote:
>> ...
>> *How much a misconception is language identification a "solved task"?*.
>> 5 years ago, there was some discussion: http://goo.gl/vB4CVb
>> ...
>> Most recent, Discriminating between Similar Languages (DSL) Shared Task
>> also shows that what we know about language ID is still far from
>> perfect: https://goo.gl/PBtXjd
>
> Perfect is a pretty high standard. At any rate, you might have a look
> at this:
> http://indigenoustweets.blogspot.com/2011/12/1000-languages-on-web.html
>
> BTW, IMO it would be better not to use URL shorteners (like goo.gl) in
> emails. Many people--myself included--will be hesitant to click on such
> things.


