We invite you to participate in the first shared task on Large Multi-lingual Hierarchical Multi-label Classification (MHMC) of Patents. The shared task’s workshop will be held at KONVENS and SwissText 2020.

Task Description:

We collected around 366k labelled patents from the European Patent Organization. The roughly 700 labels, a subset of 70k labels, are organized in an ontology that includes a description of each label. The patents are in the following languages: English (75%), German (20%) and French (5%). The goal is to assign each patent multiple labels from the ontology (hierarchical multi-label classification). The task is divided into two subtasks: one evaluates a good MHMC system, and the other addresses the zero-/few-shot scenario, which often appears in datasets with large label sets.

Subtask A: Classify the patents in a standard multi-lingual hierarchical multi-label document classification setup with a large number of patents.

Subtask B: This subtask requires a zero-shot/few-shot approach, since some labels in the test set have very few or even zero training samples. For this subtask, we provide the ontology with the descriptions of the classes.

The evaluation measure is the harmonic mean of the micro and macro F1 scores. The test samples belonging to Subtask B will not be considered when scoring Subtask A, i.e. Subtask A is a subsample of Subtask B.
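For participants who want to approximate the evaluation measure locally, the following is a minimal sketch in Python, not the official scorer. It assumes that each patent's gold and predicted labels are given as flat sets of label identifiers, that macro F1 is averaged over every label occurring in the gold or predicted sets, and that F1 is taken as 0 when a denominator is zero. The function names and the toy labels are hypothetical, and the organizers' implementation may handle these cases differently.

```python
from typing import Dict, List, Set


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; assumed to be 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_hmean(gold: List[Set[str]], pred: List[Set[str]]) -> Dict[str, float]:
    """Micro F1, macro F1, and their harmonic mean for multi-label predictions."""
    # Assumption: macro F1 averages over labels seen in gold or predictions.
    labels = sorted(set().union(*gold, *pred))
    counts = {label: [0, 0, 0] for label in labels}  # [tp, fp, fn] per label
    for g, p in zip(gold, pred):
        for label in g & p:
            counts[label][0] += 1  # true positive
        for label in p - g:
            counts[label][1] += 1  # false positive
        for label in g - p:
            counts[label][2] += 1  # false negative
    # Micro F1 pools the counts over all labels before computing F1.
    micro = f1(*(sum(c[i] for c in counts.values()) for i in range(3)))
    # Macro F1 computes F1 per label and then averages.
    macro = sum(f1(*c) for c in counts.values()) / len(counts) if counts else 0.0
    hmean = 2 * micro * macro / (micro + macro) if micro + macro else 0.0
    return {"micro_f1": micro, "macro_f1": macro, "harmonic_mean": hmean}


# Toy example with three patents and hypothetical labels "A", "B", "C".
gold = [{"A", "B"}, {"A"}, {"C"}]
pred = [{"A"}, {"A", "B"}, {"C"}]
scores = micro_macro_hmean(gold, pred)
print(scores)
```

In the toy example, label "B" is never predicted correctly, so it lowers macro F1 (2/3) more than micro F1 (0.75). The harmonic mean (12/17, about 0.706) therefore rewards systems that also perform well on rare labels.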



15 January 2020: Start of Competition


17 March 2020: Start of Test Phase


24 March 2020: End of Competition


31 March 2020: Results Announcement


14 April 2020: System Description Submission


28 April 2020: Notification of Acceptance


5 May 2020: Camera-Ready Submission


23-25 June 2020: Presentation of Results at SwissText & KONVENS Joint Conference 2020

For further information and updates, please check:




Dr. Fernando Benites (Zurich University of Applied Sciences (ZHAW), Switzerland)

Dr. Ahmad Aghaebrahimian (Zurich University of Applied Sciences (ZHAW), Switzerland)

Steffen Remus (University of Hamburg, Germany)

