org.apache.lucene.analysis.cn
public final class ChineseTokenizer
java.lang.Object
   org.apache.lucene.util.AttributeSource
      org.apache.lucene.analysis.TokenStream
         org.apache.lucene.analysis.Tokenizer
            org.apache.lucene.analysis.cn.ChineseTokenizer

All Implemented Interfaces:
    Closeable

Tokenizes Chinese text into individual Chinese characters.

ChineseTokenizer and CJKTokenizer differ in their token parsing logic.

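For example, if the Chinese text "C1C2C3C4" is to be indexed:

    The tokens returned from ChineseTokenizer are C1, C2, C3, C4.
    The tokens returned from CJKTokenizer are C1C2, C2C3, C3C4.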

Therefore the index created by CJKTokenizer is much larger.
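
The two splitting schemes can be illustrated without Lucene. The following is a minimal sketch, not the library's implementation: it assumes ChineseTokenizer emits one token per character and CJKTokenizer emits overlapping two-character tokens, and it ignores everything else the real classes do. The class NGramSketch and its methods are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
class NGramSketch {
    // One token per Unicode character (ChineseTokenizer-style, by assumption).
    static List<String> unigrams(String text) {
        List<String> out = new ArrayList<>();
        text.codePoints().forEach(cp -> out.add(new String(Character.toChars(cp))));
        return out;
    }

    // One token per pair of adjacent characters (CJKTokenizer-style, by assumption).
    // A lone character is kept as a single token.
    static List<String> bigrams(String text) {
        List<String> chars = unigrams(text);
        if (chars.size() < 2) {
            return chars;
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < chars.size(); i++) {
            out.add(chars.get(i) + chars.get(i + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Four Chinese characters standing in for C1C2C3C4.
        String text = "\u4e2d\u534e\u4eba\u6c11";
        System.out.println("unigrams: " + unigrams(text));
        System.out.println("bigrams:  " + bigrams(text));
    }
}
```

In this sketch a text of n characters yields n unigram tokens but only n - 1 bigram tokens, so the size difference is plausibly one of vocabulary: a character set of N symbols allows at most N distinct unigram terms but up to N * N distinct bigram terms.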

The trade-off is that searches for C1, C1C2, C1C3, C4C2, C1C2C3 ... work with ChineseTokenizer but will not work with CJKTokenizer.
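
The search behaviour can be sketched for two of these queries, a single character (C1) and two non-adjacent characters (C1C3). This is a simplified model under stated assumptions, not how Lucene evaluates queries: the query is split with the same scheme as the indexed text, a match means every query token appears among the indexed tokens, and token positions are ignored. The class SearchSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of Lucene.
class SearchSketch {
    // One token per character (ChineseTokenizer-style, by assumption).
    static List<String> unigrams(String text) {
        List<String> out = new ArrayList<>();
        text.codePoints().forEach(cp -> out.add(new String(Character.toChars(cp))));
        return out;
    }

    // Overlapping character pairs (CJKTokenizer-style, by assumption).
    // A lone character is kept as a single token.
    static List<String> bigrams(String text) {
        List<String> chars = unigrams(text);
        if (chars.size() < 2) {
            return chars;
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < chars.size(); i++) {
            out.add(chars.get(i) + chars.get(i + 1));
        }
        return out;
    }

    // True if every query token occurs among the indexed tokens (positions ignored).
    static boolean matches(List<String> indexed, List<String> queryTokens) {
        Set<String> terms = new HashSet<>(indexed);
        return !queryTokens.isEmpty() && terms.containsAll(queryTokens);
    }

    public static void main(String[] args) {
        String doc = "\u4e2d\u534e\u4eba\u6c11"; // stands in for C1C2C3C4
        String c1 = "\u4e2d";                    // C1
        String c1c3 = "\u4e2d\u4eba";            // C1C3, not adjacent in doc

        System.out.println("C1,   unigrams: " + matches(unigrams(doc), unigrams(c1)));
        System.out.println("C1,   bigrams:  " + matches(bigrams(doc), bigrams(c1)));
        System.out.println("C1C3, unigrams: " + matches(unigrams(doc), unigrams(c1c3)));
        System.out.println("C1C3, bigrams:  " + matches(bigrams(doc), bigrams(c1c3)));
    }
}
```

Under these assumptions both queries match the unigram index, while neither matches the bigram index, because the index then holds only the pairs C1C2, C2C3 and C3C4.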

Fields inherited from org.apache.lucene.analysis.Tokenizer:
input
Constructor:
 public ChineseTokenizer(Reader in)
 public ChineseTokenizer(AttributeSource source, Reader in)
 public ChineseTokenizer(AttributeFactory factory, Reader in)
Method from org.apache.lucene.analysis.cn.ChineseTokenizer Summary:
end,   incrementToken,   reset,   reset
Methods from org.apache.lucene.analysis.Tokenizer:
close,   correctOffset,   reset
Methods from org.apache.lucene.analysis.TokenStream:
close,   end,   incrementToken,   reset
Methods from org.apache.lucene.util.AttributeSource:
addAttribute,   addAttributeImpl,   captureState,   clearAttributes,   cloneAttributes,   equals,   getAttribute,   getAttributeClassesIterator,   getAttributeFactory,   getAttributeImplsIterator,   hasAttribute,   hasAttributes,   hashCode,   restoreState,   toString
Methods from java.lang.Object:
clone,   equals,   finalize,   getClass,   hashCode,   notify,   notifyAll,   toString,   wait,   wait,   wait
Method from org.apache.lucene.analysis.cn.ChineseTokenizer Detail:
 public final void end()
 public boolean incrementToken() throws IOException
 public void reset() throws IOException
 public void reset(Reader input) throws IOException