[Corpora-List] Converting Stanford typed dependencies to universal dependencies for English

Amir Zeldes Amir.Zeldes at georgetown.edu
Thu Sep 3 15:45:15 CEST 2015

Hi Richard,

Thanks, that's exactly the sort of thing I'm looking for! It first tries to convert PTB brackets to dependencies, but that step can be skipped, so it's no problem (my data is natively annotated with dependencies).

I can see where it's converting all of the POS tags, changing the labels, and also doing the conditional changes such as the different types of 'to'. What I'm not seeing yet is the re-wiring of dependency edges for things like 'case' or 'name', but maybe I'm just missing it. If you or anyone else knows more about that, I'd appreciate a message off-list.
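
To make concrete what kind of re-wiring is meant for 'case', here is a minimal sketch in Python. It is not taken from the code in universal_treebanks_v2.0; the token format (dicts with "id", "form", "head", "deprel") and the function name rewire_prepositions are assumptions made up for illustration, and the target label "nmod" is just one plausible choice. It turns a Stanford-style prep/pobj chain into a UD-style attachment in which the noun attaches to the former head of the preposition and the preposition becomes a 'case' dependent of the noun.

```python
def rewire_prepositions(tokens):
    """Return a copy of `tokens` with Stanford prep/pobj chains re-wired.

    Hypothetical token format: a list of dicts with the keys
    "id", "form", "head" (0 = root) and "deprel".
    """
    out = [dict(t) for t in tokens]        # do not modify the input
    by_id = {t["id"]: t for t in out}
    for tok in out:
        if tok["deprel"] != "pobj":
            continue
        prep = by_id.get(tok["head"])
        if prep is None or prep["deprel"] != "prep":
            continue
        # The noun takes over the preposition's attachment point ...
        tok["head"] = prep["head"]
        tok["deprel"] = "nmod"             # assumed target label
        # ... and the preposition becomes a 'case' dependent of the noun.
        prep["head"] = tok["id"]
        prep["deprel"] = "case"
    return out


# "The cat sat on the mat" in Stanford basic dependencies
sentence = [
    {"id": 1, "form": "The", "head": 2, "deprel": "det"},
    {"id": 2, "form": "cat", "head": 3, "deprel": "nsubj"},
    {"id": 3, "form": "sat", "head": 0, "deprel": "root"},
    {"id": 4, "form": "on",  "head": 3, "deprel": "prep"},
    {"id": 5, "form": "the", "head": 6, "deprel": "det"},
    {"id": 6, "form": "mat", "head": 4, "deprel": "pobj"},
]

converted = rewire_prepositions(sentence)
for t in converted:
    print(t["id"], t["form"], t["head"], t["deprel"], sep="\t")
```

This only covers the simplest case. Other dependents of the preposition (modifiers, for instance) stay attached to it, and multi-word prepositions, prepositional complements that are clauses, and 'name' are not handled at all.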

Thanks again, Amir

-----Original Message-----

Hi Amir,

Since the English data used in the Universal Dependency Treebank cannot be freely distributed, they include code to automatically tag/convert it. If I remember correctly, it uses an old version of the Stanford Parser and applies a transformation to the universal categories.


You can find the code in the "std/en" folder of universal_treebanks_v2.0.tar.gz.

I didn't try it, but it might be what you are looking for.


-- Richard

On 31.08.2015, at 21:37, Amir Zeldes <Amir.Zeldes at georgetown.edu> wrote:

> Hi everyone,
> I'm wondering if anybody has or knows of a script for converting Stanford Typed Dependencies to the Universal Dependencies scheme for English. I realize it's non-trivial because of the different handling of prepositions, case, names etc., but I think with some heuristics a good baseline solution might be possible. Has anyone worked on a tool to do this automatically?
> Thanks,
> Amir

