********************
* Evaluation Data  *
********************

FileA: outPaiceJava.txt
FileB: outPorterOrig.txt

The mean number of words per conflation class: 1.5957167
The Index Compression Factor: 0.37332234
The number of words and stems that differ: 151264

The mean number of characters removed: 1.0405544
The median number of characters removed: 0.0
The mode number of characters removed: 0

The characters removed table:
Number of words with  0 Chars Removed   174721
Number of words with  1 Chars Removed    34134
Number of words with  2 Chars Removed    55310
Number of words with  3 Chars Removed    25239
Number of words with  4 Chars Removed     7964
Number of words with  5 Chars Removed     7146
Number of words with  6 Chars Removed     2611
Number of words with  7 Chars Removed     1363
Number of words with  8 Chars Removed      627
Number of words with  9 Chars Removed      240
Number of words with 10 Chars Removed      109
Number of words with 11 Chars Removed       23
Number of words with 12 Chars Removed       20
Number of words with 13 Chars Removed        3

The mean Hamming distance: 1.1114956
The median Hamming distance: 0.0
The mode Hamming distance: 0

The Hamming distance table:
Number of words with  0 Hamming distance   158246
Number of words with  1 Hamming distance    49587
Number of words with  2 Hamming distance    54203
Number of words with  3 Hamming distance    26179
Number of words with  4 Hamming distance     8601
Number of words with  5 Hamming distance     7454
Number of words with  6 Hamming distance     2696
Number of words with  7 Hamming distance     1400
Number of words with  8 Hamming distance      710
Number of words with  9 Hamming distance      262
Number of words with 10 Hamming distance      119
Number of words with 11 Hamming distance       28
Number of words with 12 Hamming distance       22
Number of words with 13 Hamming distance        3

The Fox and Frakes Similarity Metric: 0.89968866
The Chris O'Neill Similarity Metric: 85.93629938941254%
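
The summary figures above can be cross-checked against the two tables. The following is a minimal Python sketch, not the evaluation tool's own code: the function name `summarize` and the list names are hypothetical, and the counts are copied from the tables in this report. It also assumes the usual definitions of mean conflation class size (N / S) and Index Compression Factor ((N - S) / N), for N distinct words and S distinct stems, under which ICF = 1 - 1 / (mean class size); the report itself does not state which definitions were used.

```python
# Counts copied from the report; list index = characters removed / Hamming distance.
chars_removed = [174721, 34134, 55310, 25239, 7964, 7146, 2611,
                 1363, 627, 240, 109, 23, 20, 3]
hamming = [158246, 49587, 54203, 26179, 8601, 7454, 2696,
           1400, 710, 262, 119, 28, 22, 3]


def summarize(counts):
    """Return (total, mean, median, mode) for a histogram in which
    counts[k] is the number of words whose value is k."""
    total = sum(counts)
    mean = sum(k * c for k, c in enumerate(counts)) / total
    # Mode: the value with the largest count.
    mode = max(range(len(counts)), key=lambda k: counts[k])

    def value_at(rank):
        # Value of the item at 0-based position `rank` in sorted order.
        running = 0
        for k, c in enumerate(counts):
            running += c
            if rank < running:
                return k
        raise IndexError("rank out of range")

    # Median: average of the two middle items (they coincide when total is odd).
    median = (value_at((total - 1) // 2) + value_at(total // 2)) / 2
    return total, mean, median, mode


total_c, mean_c, median_c, mode_c = summarize(chars_removed)
total_h, mean_h, median_h, mode_h = summarize(hamming)

# Words with a non-zero Hamming distance between the two outputs.
differing = total_h - hamming[0]

# Assumed relation between the two index-compression figures (see lead-in).
mean_class_size = 1.5957167
icf = 1 - 1 / mean_class_size

print(total_c, mean_c, median_c, mode_c)
print(total_h, mean_h, median_h, mode_h)
print(differing, icf)
```

Both tables sum to the same word total, and that total minus the zero-distance row gives the reported number of differing words, which suggests the "differ" count is derived from the Hamming distance table.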