org.incava.text
Class SpellChecker

java.lang.Object
  extended by org.incava.text.SpellChecker
Direct Known Subclasses:
NoCaseSpellChecker

public class SpellChecker
extends java.lang.Object

Calculates the edit distance between two strings.


Field Summary
protected static int ARR_SIZE
           
protected static int COMP_LEN
           
static int DEFAULT_MAX_DISTANCE
           
 
Constructor Summary
SpellChecker()
           
 
Method Summary
 boolean addDictionary(java.lang.String dictionary)
          Adds the given dictionary.
 void addWord(java.lang.String word)
           
protected  int compare(java.lang.String str1, int len1, java.lang.String str2, int len2)
          Compares the two characters.
 int editDistance(java.lang.String str1, java.lang.String str2)
          Computes the Levenshtein edit distance between the two words, up to a maximum of 3; once the distance reaches that limit, computation stops.
 int editDistance(java.lang.String str1, java.lang.String str2, int maximum)
          Computes the Levenshtein edit distance between the two words.
 java.lang.String getKey(java.lang.String word)
           
 boolean hasWord(java.lang.String word)
           
 boolean isCorrect(java.lang.String word, int maxEditDistance, java.util.Map nearMatches)
           
 boolean isCorrect(java.lang.String word, java.util.Map nearMatches)
           
protected static int min3(int x, int y, int z)
           
 boolean nearMatch(java.lang.String str1, java.lang.String str2)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_MAX_DISTANCE

public static final int DEFAULT_MAX_DISTANCE
See Also:
Constant Field Values

COMP_LEN

protected static final int COMP_LEN
See Also:
Constant Field Values

ARR_SIZE

protected static final int ARR_SIZE
See Also:
Constant Field Values
Constructor Detail

SpellChecker

public SpellChecker()
Method Detail

editDistance

public int editDistance(java.lang.String str1,
                        java.lang.String str2)
Computes the Levenshtein edit distance between the two words, up to a maximum of 3; once the distance reaches that limit, computation stops.


editDistance

public int editDistance(java.lang.String str1,
                        java.lang.String str2,
                        int maximum)
Computes the Levenshtein edit distance between the two words.
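
The following is a minimal, standalone sketch of a Levenshtein distance with an early cutoff. It is not the source of SpellChecker; the class name BoundedEditDistance and the convention of returning the maximum itself once the limit is reached are assumptions made for illustration.

```java
// Hypothetical illustration; not the org.incava.text implementation.
final class BoundedEditDistance {

    /**
     * Returns the Levenshtein distance between a and b, or max when the
     * distance is at least max (an assumed cutoff convention).
     */
    static int editDistance(String a, String b, int max) {
        // The difference in length is a lower bound on the distance.
        if (Math.abs(a.length() - b.length()) >= max) {
            return max;
        }

        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            int rowMin = curr[0];
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(deletion, insertion), substitution);
                rowMin = Math.min(rowMin, curr[j]);
            }
            // Row minima never decrease, so the final distance is at least rowMin.
            if (rowMin >= max) {
                return max;
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return Math.min(prev[b.length()], max);
    }
}
```

Under these assumptions, calling editDistance("kitten", "sitting", 10) yields 3, while a small maximum lets the loop stop as soon as every entry in a row has reached the limit.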


nearMatch

public boolean nearMatch(java.lang.String str1,
                         java.lang.String str2)

addDictionary

public boolean addDictionary(java.lang.String dictionary)
Adds the given dictionary. Returns whether it could be read and had content.


getKey

public java.lang.String getKey(java.lang.String word)

addWord

public void addWord(java.lang.String word)

hasWord

public boolean hasWord(java.lang.String word)

isCorrect

public boolean isCorrect(java.lang.String word,
                         int maxEditDistance,
                         java.util.Map nearMatches)
Parameters:
nearMatches - a map from edit distances to matches.
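
As a rough sketch of what a map from edit distances to matches could look like, the standalone example below groups candidate words by their distance from a misspelled word. The class NearMatchSketch, the list-based dictionary, and the value type List<String> are assumptions; the documentation above only states that the map goes from edit distances to matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the org.incava.text implementation.
final class NearMatchSketch {

    // Plain Levenshtein distance, included only to keep the sketch self-contained.
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    /**
     * Returns true if word is in the dictionary. Otherwise records every
     * dictionary word within maxEditDistance in nearMatches, keyed by its
     * distance, and returns false.
     */
    static boolean isCorrect(String word, List<String> dictionary,
                             int maxEditDistance,
                             Map<Integer, List<String>> nearMatches) {
        if (dictionary.contains(word)) {
            return true;
        }
        for (String candidate : dictionary) {
            int dist = distance(word, candidate);
            if (dist <= maxEditDistance) {
                nearMatches.computeIfAbsent(dist, k -> new ArrayList<>()).add(candidate);
            }
        }
        return false;
    }
}
```

Passing a java.util.TreeMap as nearMatches keeps the closest matches first when the map is iterated, which is convenient for presenting suggestions.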

isCorrect

public boolean isCorrect(java.lang.String word,
                         java.util.Map nearMatches)

compare

protected int compare(java.lang.String str1,
                      int len1,
                      java.lang.String str2,
                      int len2)
Compares the two characters. Comparison of English words should probably be case-insensitive; comparison of code should not be.


min3

protected static int min3(int x,
                          int y,
                          int z)