public static class WhitespaceTokenizer.WhitespaceTokenizerFactory<T extends HasWord> extends java.lang.Object implements TokenizerFactory<T>
Constructor and Description |
---|
WhitespaceTokenizerFactory(LexedTokenFactory<T> factory) |
WhitespaceTokenizerFactory(LexedTokenFactory<T> factory, boolean tokenizeNLs) |
WhitespaceTokenizerFactory(LexedTokenFactory<T> factory, java.lang.String options) |
Modifier and Type | Method and Description |
---|---|
java.util.Iterator<T> | getIterator(java.io.Reader r): Return an iterator over the contents read from r. |
Tokenizer<T> | getTokenizer(java.io.Reader r): Get a tokenizer for this reader. |
Tokenizer<T> | getTokenizer(java.io.Reader r, java.lang.String extraOptions): Get a tokenizer for this reader. |
static TokenizerFactory<Word> | newTokenizerFactory(): Constructs a new TokenizerFactory that returns Word objects and treats carriage returns as normal whitespace. |
void | setOptions(java.lang.String options): Sets default options for how tokenizers built from this factory should behave. |
public WhitespaceTokenizerFactory(LexedTokenFactory<T> factory)
public WhitespaceTokenizerFactory(LexedTokenFactory<T> factory, java.lang.String options)
public WhitespaceTokenizerFactory(LexedTokenFactory<T> factory, boolean tokenizeNLs)
public static TokenizerFactory<Word> newTokenizerFactory()
public java.util.Iterator<T> getIterator(java.io.Reader r)
Specified by: getIterator in interface IteratorFromReaderFactory<T extends HasWord>
Parameters: r - Where to read objects from
public Tokenizer<T> getTokenizer(java.io.Reader r)
Specified by: getTokenizer in interface TokenizerFactory<T extends HasWord>
Parameters: r - A Reader (which is assumed to already be buffered, if appropriate)
public Tokenizer<T> getTokenizer(java.io.Reader r, java.lang.String extraOptions)
Specified by: getTokenizer in interface TokenizerFactory<T extends HasWord>
Parameters: r - A Reader (which is assumed to already be buffered, if appropriate)
extraOptions - Options for how this tokenizer should behave
public void setOptions(java.lang.String options)
Specified by: setOptions in interface TokenizerFactory<T extends HasWord>
Parameters: options - Options for how this tokenizer should behave
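
The relationship between the factory-wide defaults (set through the constructors or setOptions) and the per-call extraOptions argument of getTokenizer can be illustrated with a small stand-in. The sketch below is not the library class and does not depend on it: the class name WhitespaceFactorySketch, the "tokenizeNLs=true" option syntax, the use of "\n" as the newline token, and the rule that extraOptions override the defaults for a single call are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; this is NOT the library class.
class WhitespaceFactorySketch {
    // Factory-wide default, playing the role of the tokenizeNLs constructor argument.
    private boolean tokenizeNLs;

    WhitespaceFactorySketch(boolean tokenizeNLs) {
        this.tokenizeNLs = tokenizeNLs;
    }

    // Plays the role of setOptions(String): changes the default for later calls.
    void setOptions(String options) {
        this.tokenizeNLs = parseTokenizeNLs(options, this.tokenizeNLs);
    }

    // Plays the role of getTokenizer(Reader): uses the factory defaults.
    List<String> tokenize(Reader r) {
        return tokenize(r, null);
    }

    // Plays the role of getTokenizer(Reader, String): extraOptions apply to this call only.
    List<String> tokenize(Reader r, String extraOptions) {
        boolean keepNLs = parseTokenizeNLs(extraOptions, tokenizeNLs);
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try {
            int c;
            while ((c = r.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    flush(current, tokens);
                    if (c == '\n' && keepNLs) {
                        tokens.add("\n"); // assumed representation of a newline token
                    }
                } else {
                    current.append((char) c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        flush(current, tokens);
        return tokens;
    }

    // Emits the pending token, if any, and resets the buffer.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    // Assumed option syntax: comma-separated "key=value" pairs; a bare key means true.
    private static boolean parseTokenizeNLs(String options, boolean fallback) {
        if (options == null) {
            return fallback;
        }
        for (String part : options.split(",")) {
            String[] kv = part.trim().split("=", 2);
            if (kv[0].equals("tokenizeNLs")) {
                return kv.length == 1 || Boolean.parseBoolean(kv[1].trim());
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        WhitespaceFactorySketch factory = new WhitespaceFactorySketch(false);
        String text = "a b\nc";
        System.out.println(factory.tokenize(new StringReader(text)).size());                     // 3
        System.out.println(factory.tokenize(new StringReader(text), "tokenizeNLs=true").size()); // 4
        factory.setOptions("tokenizeNLs=true");
        System.out.println(factory.tokenize(new StringReader(text)).size());                     // 4
    }
}
```

In this sketch the second call yields a newline token without changing the factory, while setOptions changes the result of every later call. Whether the real factory resolves conflicts between defaults and extraOptions the same way should be checked against the library source.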