********************
* Evaluation Data *
********************

FileA: document
FileB: outPorterOrig.txt

The mean number of words per conflation class: 1.1453488
The Index Compression Factor: 0.12690355
The number of words and stems that differ: 394

The mean number of characters removed: 1.1332487
The median number of characters removed: 0.0
The mode number of characters removed: 0

The characters removed table:
  Number of words with 0 Chars Removed    426
  Number of words with 1 Chars Removed    106
  Number of words with 2 Chars Removed    106
  Number of words with 3 Chars Removed     68
  Number of words with 4 Chars Removed     47
  Number of words with 5 Chars Removed     28
  Number of words with 6 Chars Removed      6
  Number of words with 7 Chars Removed      1

The mean Hamming distance: 1.1865482
The median Hamming distance: 0.0
The mode Hamming distance: 0

The Hamming distance table:
  Number of words with 0 Hamming distance    394
  Number of words with 1 Hamming distance    135
  Number of words with 2 Hamming distance    104
  Number of words with 3 Hamming distance     71
  Number of words with 4 Hamming distance     49
  Number of words with 5 Hamming distance     28
  Number of words with 6 Hamming distance      6
  Number of words with 7 Hamming distance      1

The Fox and Frakes Similarity Metric: 0.84278077
The Chris O'Neill Similarity Metric: 86.23843901445424%
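
The summary values in the report can be cross-checked against its two tables. The Python sketch below is not the tool that produced this output. It recomputes the means and modes from the table counts, and it assumes the usual definitions of the Index Compression Factor, (N - S) / N, and of the mean conflation class size, N / S, where N is the number of distinct words and S the number of distinct stems. The value S = 688 does not appear in the report; it is inferred here from N = 788 (the table total) and the reported ratios. The modified_hamming helper is likewise an assumption about how distance is measured when a word and its stem differ in length.

```python
# Table counts copied from the report: value -> number of words.
chars_removed = {0: 426, 1: 106, 2: 106, 3: 68, 4: 47, 5: 28, 6: 6, 7: 1}
hamming = {0: 394, 1: 135, 2: 104, 3: 71, 4: 49, 5: 28, 6: 6, 7: 1}


def total(hist):
    """Number of words counted in a table."""
    return sum(hist.values())


def mean(hist):
    """Weighted mean of the table's values."""
    return sum(value * count for value, count in hist.items()) / total(hist)


def mode(hist):
    """Value with the largest word count."""
    return max(hist, key=hist.get)


def index_compression_factor(n_words, n_stems):
    """Assumed definition: (N - S) / N."""
    return (n_words - n_stems) / n_words


def mean_words_per_class(n_words, n_stems):
    """Assumed definition: N / S."""
    return n_words / n_stems


def modified_hamming(word, stem):
    """Assumed distance: mismatches over the shared length, plus the length gap."""
    shared = min(len(word), len(stem))
    mismatches = sum(1 for a, b in zip(word[:shared], stem[:shared]) if a != b)
    return mismatches + abs(len(word) - len(stem))


N_WORDS = total(chars_removed)  # 788, taken from the table
N_STEMS = 688                   # ASSUMPTION: inferred, not stated in the report

if __name__ == "__main__":
    print("mean chars removed:", mean(chars_removed))
    print("mean Hamming distance:", mean(hamming))
    print("index compression factor:", index_compression_factor(N_WORDS, N_STEMS))
    print("mean words per class:", mean_words_per_class(N_WORDS, N_STEMS))
```

Under these assumptions a rewrite such as "happy" to "happi" removes no characters but has distance 1, which would account for the Hamming table listing fewer zero entries (394) than the characters removed table (426).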