Task description

Lexical substitution is the task of identifying an appropriate substitute for a target word in a given context. For example, in the sentence "She's a bright kid who excels academically," an appropriate substitute for "bright" might be "smart", whereas an inappropriate one would be "glowing". Automatically identifying substitution candidates, and selecting those which best match the context, requires intelligent application of lexical-semantic knowledge and word sense disambiguation techniques. However, unlike traditional WSD tasks, lexical substitution does not mandate the use of any particular sense inventory.

The data for the GermEval 2015: LexSub task is described by Cholakov et al. (2014). Altogether it consists of 2040 sentences from the German Wikipedia, each containing a target word and a list of substitutions proposed by human annotators. There are 153 unique target words, equally distributed across parts of speech (nouns, verbs, and adjectives) and three frequency groups. About half of this data (26 nouns, 26 verbs, and 26 adjectives in 1040 sentence contexts) forms the training set, which will be made available to participants in advance. The remainder forms the test set, which will be used for the evaluation and published in full only after the shared task is completed.

Participants need not rely on any particular language resources, but if they wish they can employ the sense-linked lexical-semantic resource UBY (Gurevych et al., 2012) and JoBimText distributional semantics models (Biemann et al., 2013). UBY also provides an interface to GermaNet (Hamp and Feldweg, 2007; Henrich and Hinrichs, 2010). Industrial users will be eligible for a special GermaNet licence, to be obtained from Eberhard-Karls Universität Tübingen. Please refer to our pages on how to obtain the data sets and resources.

Systems' performance will be measured by comparing their substitutes against those selected by the human annotators; for this we will use the "best" and "out of ten" metrics described by McCarthy and Navigli (2009), and the "generalized average precision" metric from Kishida (2005). The organizers will provide a scoring system and the output of some baseline systems.
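To give a feel for how these metrics reward a system, the following is a minimal Python sketch of the per-item credit under "best", "out of ten", and generalized average precision, as these measures are commonly stated. It is not the official scoring system: the function names, the dictionary representation of the annotator responses, and the example counts are all assumptions made for illustration, and it omits the averaging over items (precision and recall) and the mode variants. The scorer provided by the organizers remains authoritative.

```python
def best_score(answers, gold):
    """Per-item credit under "best".

    answers: list of distinct substitutes proposed by the system.
    gold: dict mapping each gold substitute to the number of annotators
          who proposed it (hypothetical representation).
    The summed credit is divided by the number of answers, so guessing
    more substitutes dilutes the score.
    """
    if not answers or not gold:
        return 0.0
    total_responses = sum(gold.values())
    credit = sum(gold.get(s, 0) for s in answers)
    return credit / (len(answers) * total_responses)


def oot_score(answers, gold):
    """Per-item credit under "out of ten".

    Up to ten answers are considered and the credit is not divided by
    the number of answers.
    """
    if not answers or not gold:
        return 0.0
    total_responses = sum(gold.values())
    credit = sum(gold.get(s, 0) for s in answers[:10])
    return credit / total_responses


def gap_score(ranked, gold):
    """Generalized average precision for one ranked list of substitutes.

    At every rank holding a gold substitute, add the average gold weight
    of the list up to that rank; normalize by the same quantity computed
    for the ideal ranking (gold weights in descending order).
    """
    running, numerator = 0.0, 0.0
    for rank, substitute in enumerate(ranked, start=1):
        weight = gold.get(substitute, 0)
        running += weight
        if weight > 0:
            numerator += running / rank

    running, denominator = 0.0, 0.0
    for rank, weight in enumerate(sorted(gold.values(), reverse=True), start=1):
        running += weight
        denominator += running / rank

    return numerator / denominator if denominator else 0.0


# Hypothetical annotator counts for "bright" in the example sentence.
gold = {"smart": 3, "clever": 2, "intelligent": 1}

print(best_score(["smart"], gold))             # 3 / (1 * 6)
print(best_score(["smart", "glowing"], gold))  # 3 / (2 * 6)
print(oot_score(["glowing", "smart", "clever"], gold))
print(gap_score(["glowing", "smart"], gold))
```

In this sketch a second, wrong guess halves the "best" credit, whereas "out of ten" is unaffected by wrong guesses and generalized average precision penalizes them only through the rank at which the correct substitutes appear.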

Submissions will consist of a file providing the substitutions for each instance of the target data and a paper of up to four pages (including references) describing the approach and analyzing the performance. Papers should follow the GSCL 2015 style guide, and will be reviewed and published in an online volume of workshop proceedings. (We may ask participants to peer-review other submissions.) Participants are expected to present summaries of their systems at the GermEval 2015: LexSub workshop at GSCL 2015.