********************
* Evaluation Data *
********************

FileA: document
FileB: outPorterOrig.txt

The mean number of words per conflation class: 1.5957167
The Index Compression Factor: 0.37332234
The number of words and stems that differ: 194407

The mean number of characters removed: 1.4325644
The median number of characters removed: 1.0
The mode number of characters removed: 0

The characters removed table:
Number of words with  0 Chars Removed   137414
Number of words with  1 Chars Removed    54567
Number of words with  2 Chars Removed    40847
Number of words with  3 Chars Removed    34388
Number of words with  4 Chars Removed    23111
Number of words with  5 Chars Removed    10959
Number of words with  6 Chars Removed     2576
Number of words with  7 Chars Removed     4330
Number of words with  8 Chars Removed      945
Number of words with  9 Chars Removed      345
Number of words with 10 Chars Removed       11
Number of words with 11 Chars Removed       16
Number of words with 12 Chars Removed        1

The mean Hamming distance: 1.5087622
The median Hamming distance: 1.0
The mode Hamming distance: 0

The Hamming distance table:
Number of words with  0 Hamming distance   115103
Number of words with  1 Hamming distance    76653
Number of words with  2 Hamming distance    40535
Number of words with  3 Hamming distance    34673
Number of words with  4 Hamming distance    23106
Number of words with  5 Hamming distance    11216
Number of words with  6 Hamming distance     2574
Number of words with  7 Hamming distance     4332
Number of words with  8 Hamming distance      945
Number of words with  9 Hamming distance      345
Number of words with 10 Hamming distance       11
Number of words with 11 Hamming distance       16
Number of words with 12 Hamming distance        1

The Fox and Frakes Similarity Metric: 0.66279495
The Chris O'Neill Similarity Metric: 85.28545246363186%
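
The summary figures above can be cross-checked against the two tables. The following is a minimal sketch, not the evaluation tool's own code: the names `summarize`, `value_at`, `chars_removed` and `hamming` are hypothetical, and the last step assumes the usual definitions of index compression factor as (N - S) / N and mean words per conflation class as N / S, for N distinct words and S distinct stems. The report itself does not state those definitions.

```python
# Histograms copied from the report: value -> number of words.
chars_removed = {0: 137414, 1: 54567, 2: 40847, 3: 34388, 4: 23111,
                 5: 10959, 6: 2576, 7: 4330, 8: 945, 9: 345,
                 10: 11, 11: 16, 12: 1}
hamming = {0: 115103, 1: 76653, 2: 40535, 3: 34673, 4: 23106,
           5: 11216, 6: 2574, 7: 4332, 8: 945, 9: 345,
           10: 11, 11: 16, 12: 1}


def value_at(hist, rank):
    """Return the value at a 0-based rank in the sorted, expanded data."""
    seen = 0
    for value in sorted(hist):
        seen += hist[value]
        if rank < seen:
            return value
    raise IndexError("rank is beyond the number of observations")


def summarize(hist):
    """Compute total, mean, median and mode from a value -> count table."""
    total = sum(hist.values())
    mean = sum(value * count for value, count in hist.items()) / total
    # For an even total the median averages the two middle observations.
    low = value_at(hist, (total - 1) // 2)
    high = value_at(hist, total // 2)
    return {
        "total": total,
        "mean": mean,
        "median": (low + high) / 2,
        "mode": max(hist, key=hist.get),
    }


removed_stats = summarize(chars_removed)
hamming_stats = summarize(hamming)

# Words whose stem differs from the word: nonzero Hamming distance.
differing = hamming_stats["total"] - hamming[0]

# Assumed relation: ICF = 1 - 1 / (mean words per conflation class).
icf_from_mean = 1 - 1 / 1.5957167

print(removed_stats)
print(hamming_stats)
print(differing, icf_from_mean)
```

Under these assumptions both tables sum to the same word count, the recomputed means, medians and modes agree with the reported values to the printed precision, and the count of words with nonzero Hamming distance equals the reported number of words and stems that differ.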