public abstract class AbstractTokenizer<T> extends java.lang.Object implements Tokenizer<T>
Subclasses need only implement the getNext() method. This implementation does not allow null tokens, since null is used in the protected nextToken field to signify that no more tokens are available.

Modifier and Type | Field and Description |
---|---|
static java.lang.String | NEWLINE_TOKEN: For tokenizing carriage returns. |
protected T | nextToken |
Constructor and Description |
---|
AbstractTokenizer() |
Modifier and Type | Method and Description |
---|---|
protected abstract T | getNext(): Internally fetches the next token. |
boolean | hasNext(): Returns true if this Tokenizer has more elements. |
T | next(): Returns the next token from this Tokenizer. |
T | peek(): This is an optional operation, by default supported. |
void | remove(): This is an optional operation, by default not supported. |
java.util.List<T> | tokenize(): Returns text as a List of tokens. |
public static final java.lang.String NEWLINE_TOKEN
For tokenizing carriage returns. This token is returned to represent a newline when a tokenizer is run with the option tokenizeNLs = true. It is assumed that no tokenizer allows *NL* as a token. This is certainly true for PTBTokenizer-derived tokenizers, where the asterisks would become separate tokens.
protected T nextToken
protected abstract T getNext()
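The contract described on this page (a subclass supplies getNext(), and a null nextToken marks the end of the stream) can be illustrated with a minimal, self-contained sketch. The class names TokenizerSketch, LookaheadTokenizer, and WhitespaceTokenizer are hypothetical and exist only for this example. LookaheadTokenizer is a stand-in that mirrors the documented behavior so the example runs without the library; it is not the library's source, and the real AbstractTokenizer may differ in detail.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class TokenizerSketch {

  /** Hypothetical stand-in for the documented contract; not the library's source. */
  abstract static class LookaheadTokenizer<T> implements Iterator<T> {
    /** One-token buffer; null means nothing is buffered or the stream is exhausted. */
    protected T nextToken;

    /** Subclasses return the next token, or null when no tokens remain. */
    protected abstract T getNext();

    @Override
    public boolean hasNext() {
      if (nextToken == null) {
        nextToken = getNext(); // try to fill the buffer
      }
      return nextToken != null;
    }

    @Override
    public T next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      T result = nextToken;
      nextToken = null; // consume the buffered token
      return result;
    }

    /** Returns the next token without consuming it. */
    public T peek() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return nextToken;
    }

    /** Optional operation, not supported in this sketch. */
    @Override
    public void remove() {
      throw new UnsupportedOperationException();
    }

    /** Drains the remaining tokens into a list. */
    public List<T> tokenize() {
      List<T> result = new ArrayList<>();
      while (hasNext()) {
        result.add(next());
      }
      return result;
    }
  }

  /** Example subclass: only getNext() has to be implemented. */
  static class WhitespaceTokenizer extends LookaheadTokenizer<String> {
    private final String[] parts;
    private int index = 0;

    WhitespaceTokenizer(String text) {
      String trimmed = text.trim();
      this.parts = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    @Override
    protected String getNext() {
      // Returning null signals that no more tokens are available.
      return index < parts.length ? parts[index++] : null;
    }
  }

  public static void main(String[] args) {
    WhitespaceTokenizer tokenizer = new WhitespaceTokenizer("The quick  brown fox");
    System.out.println(tokenizer.peek());     // The
    System.out.println(tokenizer.next());     // The
    System.out.println(tokenizer.tokenize()); // [quick, brown, fox]
    System.out.println(tokenizer.hasNext());  // false
  }
}
```

Because null doubles as the end-of-stream marker, a subclass written this way cannot emit null as a real token, which is the restriction stated in the class description.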
public T next()
Specified by: next in interface java.util.Iterator<T>
Throws: java.util.NoSuchElementException - if the token stream has no more tokens.
public boolean hasNext()
Returns true if this Tokenizer has more elements.
Specified by: hasNext in interface java.util.Iterator<T>
public void remove()
Specified by: remove in interface java.util.Iterator<T>
public T peek()