Statement Segmentation in German Easy Language (StaGE)

This workshop is a GermEval task and will be co-located with KONVENS 2024. We welcome contributions in English (preferred) and German. Before submitting a contribution, we kindly ask every participant or participating team to register via our form.


German Easy Language (Leichte Sprache) is a controlled, simple language for which several guidelines have been published in the past. The draft of DIN SPEC 33429 has the potential to become the authoritative guideline. It describes several characteristics of German Easy Language. Two of these concern the number of statements per sentence and the formatting of enumerations as lists. We have therefore segmented Easy Language sentences and annotated how well these aspects of DIN SPEC 33429 are implemented in web texts written in German Easy Language.

With this shared task, we want to analyze both the number of statements and their segments, and to automate this annotation. According to our definition, a statement contains only obligatory additions and no optional (omissible) additions, following the concept of verb valency. Example: "Das Glas steht heute Abend auf dem Tisch." (English: "The glass is on the table tonight."; literally: "The glass stands today evening on the table.") Here, "auf dem Tisch" is a local adverbial and "heute Abend" is a temporal adverbial. The local adverbial is obligatory because "stehen" (English: "to stand") requires a place. The temporal adverbial is not obligatory and thus forms a new statement. Therefore, this example contains two statements.
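
The counting rule in the example above can be sketched in a few lines of Python. This is only an illustration of the simplified reading "one core statement plus one further statement per optional addition"; the `Addition` class and the `count_statements` function are hypothetical names introduced here, and the annotation guidelines cover many cases this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class Addition:
    """An addition to the verb (hypothetical helper type, not part of the task data)."""
    text: str
    kind: str         # e.g. "local" or "temporal"
    obligatory: bool  # True if the verb's valency requires this addition


def count_statements(additions):
    """Simplified assumption: the core statement counts once,
    and every optional addition forms one further statement."""
    return 1 + sum(1 for a in additions if not a.obligatory)


# "Das Glas steht heute Abend auf dem Tisch."
example = [
    Addition("auf dem Tisch", "local", obligatory=True),    # required by "stehen"
    Addition("heute Abend", "temporal", obligatory=False),  # omissible
]

print(count_statements(example))  # 2
```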

The shared task comprises two subtasks:
Subtask 1: Determining the number of statements
In the first subtask, the number of statements in the sentences should be predicted. The target is a whole number.
Subtask 2: Annotating the statement spans
In this subtask, the spans of the previously identified statements will be extracted. For details about the annotation and statement extraction, see our annotation guidelines.
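
To make the relation between the two subtasks concrete, here is a minimal sketch of one possible representation. It assumes, purely for illustration, that a sentence is split on whitespace and that each statement span is a list of token indices; the actual data format and the exact span boundaries are defined by the released data and the annotation guidelines, not by this example. The helper names `num_statements` and `span_text` are hypothetical.

```python
# Tokens of the example sentence (whitespace tokenization is an assumption).
tokens = "Das Glas steht heute Abend auf dem Tisch .".split()

# Illustrative spans as lists of token indices (an assumption, not gold data):
# one span for the core statement, one for the optional temporal adverbial.
spans = [
    [0, 1, 2, 5, 6, 7],  # "Das Glas steht auf dem Tisch"
    [3, 4],              # "heute Abend"
]


def num_statements(spans):
    """Subtask 1 target derived from a Subtask 2 output: one statement per span."""
    return len(spans)


def span_text(tokens, span):
    """Recover the surface text of a span from its token indices."""
    return " ".join(tokens[i] for i in span)


print(num_statements(spans))  # 2
for span in spans:
    print(span_text(tokens, span))
```

Under this representation, a system that solves Subtask 2 also yields a prediction for Subtask 1, whereas a system for Subtask 1 alone only needs to output the integer.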

Our aim is to analyze and evaluate existing German Easy Language texts. We do not want to enforce the use of one-statement sentences but merely to analyze the density of statements. One target group is authors of German Easy Language, who can check their own texts automatically. These annotations can also serve as a data basis for machine learning applications in the fields of readability analysis and fact checking.


We are looking forward to an interesting workshop at KONVENS 2024. Please see the timetable or contact us for more information:

     
09.03.2024  Trial data ready      data/trial.csv
14.04.2024  Train data ready      data/train.csv
18.05.2024  Test data ready
13.06.2024  Evaluation start
25.06.2024  Evaluation end
01.07.2024  Paper submission due
20.07.2024  Camera ready due
13.09.2024  Workshop date