org.apache.lucene.analysis.compound (lucene-3.0.1)
abstract public class: CompoundWordTokenFilterBase
java.lang.Object
   org.apache.lucene.util.AttributeSource
      org.apache.lucene.analysis.TokenStream
         org.apache.lucene.analysis.TokenFilter
            org.apache.lucene.analysis.compound.CompoundWordTokenFilterBase

All Implemented Interfaces:
    Closeable

Direct Known Subclasses:
    HyphenationCompoundWordTokenFilter, DictionaryCompoundWordTokenFilter

Base class for decomposition token filters.
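Subclasses implement decomposeInternal(Token) to split a compound token into dictionary subwords. As an illustration only (plain Java, no Lucene types; the class and method names below are hypothetical, not the Lucene implementation), a minimal dictionary-based decomposition loop honoring minSubwordSize/maxSubwordSize might look like:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, NOT the Lucene code: scan every start position and
// emit each dictionary word found as a substring, subject to the minimum
// and maximum subword lengths (cf. minSubwordSize/maxSubwordSize).
public class DecompositionSketch {
    public static List<String> decompose(String token, Set<String> dict,
                                         int minSubword, int maxSubword) {
        List<String> subwords = new ArrayList<>();
        for (int start = 0; start < token.length(); start++) {
            for (int len = minSubword;
                 len <= maxSubword && start + len <= token.length(); len++) {
                String candidate = token.substring(start, start + len);
                if (dict.contains(candidate)) {
                    subwords.add(candidate);
                }
            }
        }
        return subwords;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("fuss", "ball", "pumpe", "fussball"));
        System.out.println(decompose("fussballpumpe", dict, 2, 15));
        // prints [fuss, fussball, ball, pumpe]
    }
}
```

When onlyLongestMatch is set, the real filters keep only the longest dictionary hit per position rather than every overlapping match, which this sketch does not attempt.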
Field Summary
public static final int DEFAULT_MIN_WORD_SIZE    The default minimum length a token must have to be considered for decomposition
public static final int DEFAULT_MIN_SUBWORD_SIZE    The default minimum length of subwords propagated to the output of this filter
public static final int DEFAULT_MAX_SUBWORD_SIZE    The default maximum length of subwords propagated to the output of this filter
protected final CharArraySet dictionary
protected final LinkedList tokens
protected final int minWordSize
protected final int minSubwordSize
protected final int maxSubwordSize
protected final boolean onlyLongestMatch
Fields inherited from org.apache.lucene.analysis.TokenFilter:
input
Constructor:
 protected CompoundWordTokenFilterBase(TokenStream input,
    String[] dictionary) 
 protected CompoundWordTokenFilterBase(TokenStream input,
    Set dictionary) 
 protected CompoundWordTokenFilterBase(TokenStream input,
    String[] dictionary,
    boolean onlyLongestMatch) 
 protected CompoundWordTokenFilterBase(TokenStream input,
    Set dictionary,
    boolean onlyLongestMatch) 
 protected CompoundWordTokenFilterBase(TokenStream input,
    String[] dictionary,
    int minWordSize,
    int minSubwordSize,
    int maxSubwordSize,
    boolean onlyLongestMatch) 
 protected CompoundWordTokenFilterBase(TokenStream input,
    Set dictionary,
    int minWordSize,
    int minSubwordSize,
    int maxSubwordSize,
    boolean onlyLongestMatch) 
Method from org.apache.lucene.analysis.compound.CompoundWordTokenFilterBase Summary:
addAllLowerCase,   createToken,   decompose,   decomposeInternal,   incrementToken,   makeDictionary,   makeLowerCaseCopy,   reset
Methods from org.apache.lucene.analysis.TokenFilter:
close,   end,   reset
Methods from org.apache.lucene.analysis.TokenStream:
close,   end,   incrementToken,   reset
Methods from org.apache.lucene.util.AttributeSource:
addAttribute,   addAttributeImpl,   captureState,   clearAttributes,   cloneAttributes,   equals,   getAttribute,   getAttributeClassesIterator,   getAttributeFactory,   getAttributeImplsIterator,   hasAttribute,   hasAttributes,   hashCode,   restoreState,   toString
Methods from java.lang.Object:
clone,   equals,   finalize,   getClass,   hashCode,   notify,   notifyAll,   toString,   wait,   wait,   wait
Method from org.apache.lucene.analysis.compound.CompoundWordTokenFilterBase Detail:
 protected static final  void addAllLowerCase(Set target,
    Collection col) 
 protected final Token createToken(int offset,
    int length,
    Token prototype) 
 protected  void decompose(Token token) 
 abstract protected  void decomposeInternal(Token token)
 public final boolean incrementToken() throws IOException 
 public static final Set makeDictionary(String[] dictionary) 
    Creates a Set of words from an array. The resulting Set performs case-insensitive matching. TODO: we should look for a faster dictionary-lookup approach.
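As a rough sketch of the lower-casing behavior behind makeDictionary and addAllLowerCase (plain Java, with a HashSet standing in for Lucene's CharArraySet; the class name DictSketch is hypothetical):

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: lower-case every entry on the way into the set,
// so that lookups against lower-cased token text match regardless of
// the case used in the source dictionary.
public class DictSketch {
    public static Set<String> makeDictionary(String[] words) {
        Set<String> dict = new HashSet<>();
        for (String w : words) {
            dict.add(w.toLowerCase(Locale.ROOT));
        }
        return dict;
    }

    public static void main(String[] args) {
        Set<String> dict = makeDictionary(new String[] {"Donau", "Dampf", "Schiff"});
        System.out.println(dict.contains("dampf"));
        // prints true
    }
}
```

The real CharArraySet avoids allocating String objects per lookup, which is part of why the TODO above mentions a faster dictionary-lookup approach.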
 protected static char[] makeLowerCaseCopy(char[] buffer) 
 public  void reset() throws IOException
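The tokens LinkedList and incrementToken() cooperate as a buffer: subwords produced by decompose are queued and emitted on subsequent calls, after the original token. A self-contained sketch of that pattern (plain Java, with an Iterator<String> standing in for the upstream TokenStream; BufferSketch and its toy split-in-half decomposition are hypothetical, not the Lucene logic):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;

// Hypothetical sketch of the buffering pattern behind incrementToken():
// drain queued subwords first, otherwise pull the next upstream token,
// decompose it into the queue, and emit the original token itself.
public class BufferSketch {
    private final LinkedList<String> tokens = new LinkedList<>();
    private final Iterator<String> input;

    public BufferSketch(Iterator<String> input) {
        this.input = input;
    }

    public String next() {
        if (!tokens.isEmpty()) {
            return tokens.removeFirst();   // emit a buffered subword
        }
        if (!input.hasNext()) {
            return null;                   // upstream exhausted
        }
        String token = input.next();
        decompose(token);                  // stands in for decomposeInternal
        return token;                      // the original token is kept too
    }

    // Toy decomposition: split the token in half. The real subclasses use
    // a dictionary or hyphenation grammar instead.
    private void decompose(String token) {
        int mid = token.length() / 2;
        if (mid >= 2) {
            tokens.add(token.substring(0, mid));
            tokens.add(token.substring(mid));
        }
    }

    public static void main(String[] args) {
        BufferSketch s = new BufferSketch(Arrays.asList("baseball").iterator());
        for (String t; (t = s.next()) != null; ) {
            System.out.println(t);
        }
        // prints baseball, base, ball (one per line)
    }
}
```

reset() on the real filter clears this buffer in addition to delegating to the wrapped input, so a reused stream does not leak subwords from a previous run.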