Input Data. Four annotators were involved in annotating the data used for the evaluation: an experienced annotator (annotator 1) and three students of the Czech language (annotators 2, 3 and 4). The annotation was started by annotators 1, 2 and 3; at the end of 2002, annotator 1 left and was replaced by annotator 4. We can therefore compare three versions of the annotation for each file concerned. The whole data set consists of 441 triples of annotated sentence structures. Table 1 gives an overview of the files annotated in parallel and their sizes in numbers of trees and nodes.

Phase             # files  # trees  # nodes  Annotators
1 (spring 2002)         2       94     1338  1, 2, 3
2 (autumn 2002)         1       48      825  1, 2, 3
3 (spring 2003)         1       52      702  2, 3, 4
4 (autumn 2003)         5      247     3537  2, 3, 4
Total                   9      441     6402

Table 1: Data annotated in parallel

All the numbers were obtained using specific computational tools and were subsequently checked and classified manually. The classification criteria and procedures are described in the corresponding subsections.
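The arithmetic behind Table 1 can be checked with a short script. The following is only an illustrative sketch: the row values are copied from Table 1, while the names `PHASES`, `totals` and `trees_by_annotator` are hypothetical and do not refer to the tools actually used for the evaluation.

```python
# Rows of Table 1: (phase, files, trees, nodes, annotators).
# Values are copied from the table; all names here are illustrative only.
PHASES = [
    ("1 (spring 2002)", 2, 94, 1338, (1, 2, 3)),
    ("2 (autumn 2002)", 1, 48, 825, (1, 2, 3)),
    ("3 (spring 2003)", 1, 52, 702, (2, 3, 4)),
    ("4 (autumn 2003)", 5, 247, 3537, (2, 3, 4)),
]


def totals(rows):
    """Sum the file, tree and node counts over all phases."""
    files = sum(row[1] for row in rows)
    trees = sum(row[2] for row in rows)
    nodes = sum(row[3] for row in rows)
    return files, trees, nodes


def trees_by_annotator(rows):
    """Count how many trees each annotator annotated across the phases."""
    counts = {}
    for _phase, _files, trees, _nodes, annotators in rows:
        for annotator in annotators:
            counts[annotator] = counts.get(annotator, 0) + trees
    return counts


if __name__ == "__main__":
    print(totals(PHASES))
    print(trees_by_annotator(PHASES))
```

Since every tree was annotated by exactly three annotators, the total number of trees equals the number of triples reported above.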