org.apache.lucene.analysis.cn
Class ChineseFilter
public final class ChineseFilter
Title: ChineseFilter
Description: Filter with a stop word table
Rule: No digits are allowed.
English words/tokens must be longer than 1 character.
Each Chinese character is treated as one Chinese word.
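The three rules above, together with the stop word table, can be sketched in plain Java as follows. This is a minimal illustration and not the Lucene source: the class name ChineseFilterRulesSketch, the methods keep and filter, and the contents of SAMPLE_STOP_WORDS are all hypothetical, and the sketch assumes a token is classified by its first character only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone sketch of the filtering rules; not the Lucene class.
final class ChineseFilterRulesSketch {

    // Assumed sample entries; the real STOP_WORDS contents are not shown here.
    static final Set<String> SAMPLE_STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "and", "of", "the"));

    /** Returns true if the token text passes the stop word table and the rules. */
    static boolean keep(String text) {
        if (text == null || text.isEmpty() || SAMPLE_STOP_WORDS.contains(text)) {
            return false; // empty input or listed in the stop word table
        }
        // Assumption: the token is classified by its first character only.
        switch (Character.getType(text.charAt(0))) {
            case Character.LOWERCASE_LETTER:
            case Character.UPPERCASE_LETTER:
                // English word/token must be longer than 1 character.
                return text.length() > 1;
            case Character.OTHER_LETTER:
                // CJK ideographs fall in this category; one character is one word.
                return true;
            default:
                // Digits, punctuation, and everything else are dropped.
                return false;
        }
    }

    /** Applies keep() to every token and returns the survivors in order. */
    static List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (keep(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // \u4e2d and \u6587 are the two characters of the word for "Chinese language".
        List<String> input =
                Arrays.asList("the", "lucene", "x", "42", "\u4e2d", "\u6587");
        System.out.println(filter(input));
    }
}
```

With this input, "the" is removed by the stop word table, "x" by the length rule, and "42" by the digit rule, while "lucene" and the two Chinese characters are kept.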
TO DO:
1. Add Chinese stop words, such as \ue400
2. Dictionary based Chinese word extraction
3. Intelligent Chinese word extraction
Copyright: Copyright (c) 2001
Token next() - Returns the next token in the stream, or null at EOS.
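Because next() signals end of stream by returning null, callers typically pull tokens in a loop until null appears. The sketch below illustrates that contract with stand-in types only: SimpleStream, of, and drain are hypothetical names that use String in place of Token, and none of them are part of the Lucene API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of the "null at EOS" contract; not Lucene code.
final class NextLoopSketch {

    /** Stand-in for a token stream: next() returns null once exhausted. */
    interface SimpleStream {
        String next();
    }

    /** Builds a stream over fixed tokens (assumed helper for the example). */
    static SimpleStream of(String... tokens) {
        Iterator<String> it = Arrays.asList(tokens).iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Pulls tokens until next() returns null, collecting them in order. */
    static List<String> drain(SimpleStream stream) {
        List<String> out = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            out.add(t);
        }
        return out;
    }
}
```

The same loop shape applies to any stream with this contract: the null check is the only termination condition, so no separate "has next" query is needed.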
STOP_WORDS
public static final String[] STOP_WORDS
Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.