********************
* Evaluation Data *
********************

FileA: document
FileB: outPaiceJava.txt

The mean number of words per conflation class: 1.2104455
The Index Compression Factor: 0.17385787
The number of words and stems that differ: 486

The mean number of characters removed: 1.7944162
The median number of characters removed: 1.0
The mode number of characters removed: 0

The characters removed table:
    Number of words with 0 Chars Removed    302
    Number of words with 1 Chars Removed    104
    Number of words with 2 Chars Removed    148
    Number of words with 3 Chars Removed     92
    Number of words with 4 Chars Removed     55
    Number of words with 5 Chars Removed     39
    Number of words with 6 Chars Removed     24
    Number of words with 7 Chars Removed     16
    Number of words with 8 Chars Removed      5
    Number of words with 9 Chars Removed      3

The mean Hamming distance: 1.8274112
The median Hamming distance: 1.0
The mode Hamming distance: 0

The Hamming distance table:
    Number of words with 0 Hamming distance    302
    Number of words with 1 Hamming distance    103
    Number of words with 2 Hamming distance    140
    Number of words with 3 Hamming distance     97
    Number of words with 4 Hamming distance     54
    Number of words with 5 Hamming distance     40
    Number of words with 6 Hamming distance     27
    Number of words with 7 Hamming distance     16
    Number of words with 8 Hamming distance      5
    Number of words with 9 Hamming distance      4

The Fox and Frakes Similarity Metric: 0.5472222
The Chris O'Neill Similarity Metric: 77.70298830210662%
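
The summary figures above can be cross-checked against the two frequency tables. The Python sketch below is not part of the evaluation tool. It re-derives the total word count, the number of changed words, and the mean, median and mode from the table counts. The names `summarize`, `chars_removed` and `hamming` are illustrative assumptions. The last step checks an observation drawn only from these numbers: the Fox and Frakes value matches the reciprocal of the mean Hamming distance. Whether the tool actually computes it that way is an assumption.

```python
from statistics import median

# Counts copied from the report: index = value, entry = number of words.
chars_removed = [302, 104, 148, 92, 55, 39, 24, 16, 5, 3]
hamming = [302, 103, 140, 97, 54, 40, 27, 16, 5, 4]


def summarize(counts):
    """Return (total, mean, median, mode) for a frequency table."""
    total = sum(counts)
    mean = sum(value * n for value, n in enumerate(counts)) / total
    # Expand the table back into one observation per word to take the median.
    observations = [value for value, n in enumerate(counts) for _ in range(n)]
    # The mode is the value with the largest count (first one wins on ties).
    mode = max(range(len(counts)), key=lambda value: counts[value])
    return total, mean, float(median(observations)), mode


for label, counts in (("chars removed", chars_removed), ("Hamming", hamming)):
    total, mean, med, mode = summarize(counts)
    changed = total - counts[0]  # words whose stem differs from the word
    print(f"{label}: n={total} changed={changed} "
          f"mean={mean:.7f} median={med} mode={mode}")

# Observation from the numbers only (an assumption about the tool's formula):
# the Fox and Frakes value equals 1 / (mean Hamming distance).
similarity = 1 / summarize(hamming)[1]
print(f"1 / mean Hamming = {similarity:.7f}")
```

Both tables sum to 788 words, of which 486 have a nonzero entry. This agrees with the reported number of words and stems that differ. The recomputed means agree with the reported values to the seven decimal places shown.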