public class CoreLabelTokenFactory extends java.lang.Object implements CoreTokenFactory<CoreLabel>, LexedTokenFactory<CoreLabel>, java.io.Serializable
Constructs CoreLabels from Strings, optionally with
beginning and ending (character after the end) offset positions in
an original text. The makeToken method puts the token text under both the
OriginalTextAnnotation and TextAnnotation keys (two places),
and optionally records the
begin and end (position after the last character) offsets in BeginPositionAnnotation and
EndPositionAnnotation. If the tokens are built in PTBTokenizer with
an "invertible" tokenizer, you will also get a BeforeAnnotation and, for
the last token, an AfterAnnotation. You can also get an empty CoreLabel token.

Constructor and Description |
---|
CoreLabelTokenFactory()
Constructor for a new token factory which will add in the word, the
"current" annotation, and the begin/end position annotations.
|
CoreLabelTokenFactory(boolean addIndices)
Constructor that lets the caller choose whether index annotations
indicating begin/end position will be included in the label.
|
Modifier and Type | Method and Description |
---|---|
CoreLabel |
makeToken() |
CoreLabel |
makeToken(CoreLabel labelToBeCopied) |
CoreLabel |
makeToken(java.lang.String[] keys,
java.lang.String[] values) |
CoreLabel |
makeToken(java.lang.String tokenText,
int begin,
int length)
Constructs a CoreLabel from a String with corresponding begin and end positions.
|
CoreLabel |
makeToken(java.lang.String tokenText,
java.lang.String originalText,
int begin,
int length)
Constructs a CoreLabel from a String with corresponding begin and end positions,
for the case where the OriginalTextAnnotation differs from the TextAnnotation
(does not take a substring).
|
public CoreLabelTokenFactory()
public CoreLabelTokenFactory(boolean addIndices)
addIndices
- if true, begin and end position annotations will be included (this is the default)
public CoreLabel makeToken(java.lang.String tokenText, int begin, int length)
makeToken
in interface LexedTokenFactory<CoreLabel>
tokenText
- The String extracted by the lexer.
begin
- The offset in the document of the first character in this string.
length
- The number of characters the string takes up in the document.
public CoreLabel makeToken(java.lang.String tokenText, java.lang.String originalText, int begin, int length)
public CoreLabel makeToken()
makeToken
in interface CoreTokenFactory<CoreLabel>
public CoreLabel makeToken(java.lang.String[] keys, java.lang.String[] values)
makeToken
in interface CoreTokenFactory<CoreLabel>
public CoreLabel makeToken(CoreLabel labelToBeCopied)
makeToken
in interface CoreTokenFactory<CoreLabel>
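
The offset convention described for makeToken(tokenText, begin, length) can be illustrated without the library. The following is a minimal sketch in plain Java, not the CoreNLP implementation: OffsetSketch and its Token holder are hypothetical stand-ins, and the end offset is assumed to be begin + length, which is what "character after the end" together with the length parameter suggests.

```java
/** Hypothetical stand-in used only to illustrate offsets; not part of CoreNLP. */
public class OffsetSketch {

    /** Plain holder for the values the description says makeToken records. */
    public static final class Token {
        public final String text;         // stands in for TextAnnotation
        public final String originalText; // stands in for OriginalTextAnnotation
        public final int begin;           // offset of the first character
        public final int end;             // offset one past the last character

        Token(String text, String originalText, int begin, int end) {
            this.text = text;
            this.originalText = originalText;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Three-argument form: the same string fills both text slots. */
    public static Token makeToken(String tokenText, int begin, int length) {
        return makeToken(tokenText, tokenText, begin, length);
    }

    /** Four-argument form: both strings are stored as given; no substring is taken. */
    public static Token makeToken(String tokenText, String originalText, int begin, int length) {
        // Assumption: the end offset is exclusive, so end = begin + length.
        return new Token(tokenText, originalText, begin, begin + length);
    }

    public static void main(String[] args) {
        String document = "Hello, world";
        Token t = makeToken("world", 7, 5);
        // The half-open range [begin, end) recovers the token from the document.
        System.out.println(document.substring(t.begin, t.end));
        System.out.println(t.begin + ".." + t.end);
    }
}
```

Because the end offset is exclusive, it can be passed directly to String.substring, and the length of the original span is always end minus begin, even when a normalized token text has a different length than the text it replaced.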