public static class Options.LexOptions
extends java.lang.Object
implements java.io.Serializable
Modifier and Type | Field and Description |
---|---|
static java.lang.String | DEFAULT_WORD_VECTOR_FILE - RS: file for Turian's word vectors. The default value is an example of size 25 word vectors on the nlp machines. |
boolean | flexiTag |
int | numHid - Number of hidden units in the word vectors. |
boolean | smartMutation - Smarter smoothing for rare words. |
int | smoothInUnknownsThreshold - Words more common than this are tagged with MLE P(t\|w). |
int | unknownPrefixSize - For certain Lexicons, a certain number of word-initial letters are used to subclassify the unknown token. |
int | unknownSuffixSize - For certain Lexicons, a certain number of word-final letters are used to subclassify the unknown token. |
boolean | useSignatureForKnownSmoothing - Whether to use the word signature, rather than just the fact of being unknown, as the prior in known-word smoothing. |
boolean | useUnicodeType - Make use of Unicode code point types in smoothing. |
int | useUnknownWordSignatures - Whether to use suffix and capitalization information for unknowns. |
java.lang.String | uwModelTrainer - Model for unknown words that the lexicon should use. |
java.lang.String | wordClassesFile - A file of word class data which may be used for smoothing, normally instead of hand-specified signatures. |
java.lang.String | wordVectorFile |

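The smoothInUnknownsThreshold description can be made concrete with a small sketch. It assumes that the MLE is the relative frequency count(w, t) / count(w) and that "more common than this" means a training count strictly greater than the threshold. The class MleTagSketch and its methods are hypothetical and are not the parser's lexicon code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the parser's lexicon implementation.
final class MleTagSketch {
    // count(w, t), keyed as "word<TAB>tag", and count(w).
    private final Map<String, Integer> wordTagCounts = new HashMap<>();
    private final Map<String, Integer> wordCounts = new HashMap<>();
    private final int smoothInUnknownsThreshold;

    MleTagSketch(int smoothInUnknownsThreshold) {
        this.smoothInUnknownsThreshold = smoothInUnknownsThreshold;
    }

    // Records one (word, tag) training observation.
    void observe(String word, String tag) {
        wordTagCounts.merge(word + "\t" + tag, 1, Integer::sum);
        wordCounts.merge(word, 1, Integer::sum);
    }

    // Assumption: a word uses the plain MLE only when its count exceeds the threshold.
    boolean usesPlainMle(String word) {
        return wordCounts.getOrDefault(word, 0) > smoothInUnknownsThreshold;
    }

    // Relative-frequency estimate count(w, t) / count(w); 0.0 for an unseen word.
    double mle(String word, String tag) {
        int wordCount = wordCounts.getOrDefault(word, 0);
        if (wordCount == 0) {
            return 0.0;
        }
        return wordTagCounts.getOrDefault(word + "\t" + tag, 0) / (double) wordCount;
    }
}
```

In this sketch, a word at or below the threshold would be handed to whatever smoothed estimate the lexicon uses; that part is deliberately left out because the summary above does not describe it.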
Constructor and Description |
---|
LexOptions() |

Modifier and Type | Method and Description |
---|---|
void | readData(java.io.BufferedReader in) |
java.lang.String | toString() |

public int useUnknownWordSignatures
public static final java.lang.String DEFAULT_WORD_VECTOR_FILE
public java.lang.String wordVectorFile
public int numHid
public int smoothInUnknownsThreshold
public boolean smartMutation
public boolean useUnicodeType
public int unknownSuffixSize
public int unknownPrefixSize
public java.lang.String uwModelTrainer
public boolean flexiTag
public boolean useSignatureForKnownSmoothing
public java.lang.String wordClassesFile
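
The unknownPrefixSize and unknownSuffixSize fields say that word-initial and word-final letters subclassify the unknown token. A minimal sketch of that idea follows. The class UnknownAffixSketch, the base token "UNK", and the "|pre=" and "|suf=" markers are all assumptions chosen for illustration; the signature format the parser's lexicons actually produce is not given here.

```java
// Hypothetical illustration only; the signature format below is an assumption.
final class UnknownAffixSketch {
    // Builds an unknown-word class from up to prefixSize leading and
    // suffixSize trailing characters of the word. A size of 0 disables that part.
    static String subclassify(String word, int prefixSize, int suffixSize) {
        StringBuilder signature = new StringBuilder("UNK");
        if (prefixSize > 0) {
            int n = Math.min(prefixSize, word.length());
            signature.append("|pre=").append(word, 0, n);
        }
        if (suffixSize > 0) {
            int n = Math.min(suffixSize, word.length());
            signature.append("|suf=").append(word, word.length() - n, word.length());
        }
        return signature.toString();
    }

    public static void main(String[] args) {
        // Unknown words ending in the same letters fall into the same subclass.
        System.out.println(subclassify("running", 0, 3));
        System.out.println(subclassify("jumping", 0, 3));
    }
}
```

With a suffix size of 3, both example words map to the same subclass, so statistics gathered for one unknown "-ing" word can inform the other. The sketch counts Java chars rather than Unicode code points, which is a simplification.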