public class ParentAnnotationStats extends java.lang.Object implements TreeVisitor
Modifier and Type | Field and Description |
---|---|
static double[] | CUTOFFS - Minimum support * KL to be included in output and as feature. |
static double | SUPPCUTOFF - Minimum support of parent annotated node for grandparent to be studied. |
Modifier and Type | Method and Description |
---|---|
static java.util.Set<java.lang.String> | getEnglishSplitCategories(java.lang.String treebankRoot) - This is hardwired to calculate the split categories from English Penn Treebank sections 2-21 with a default cutoff of 300 (as used in ACL03PCFG). |
static java.util.Set<java.lang.String> | getSplitCategories(Treebank t, boolean doTags, int algorithm, double phrasalCutOff, double tagCutOff, TreebankLanguagePack tlp) - Call this method to get the set of category strings to split on. |
static java.util.Set<java.lang.String> | getSplitCategories(Treebank t, double cutOff, TreebankLanguagePack tlp) - Call this method to get the set of category strings to split on. |
static java.util.List<java.lang.String> | kidLabels(Tree t) |
static void | main(java.lang.String[] args) - Calculate parent annotation statistics suitable for doing selective parent splitting in the PCFGParser inside FactoredParser. |
void | printStats() |
void | processTreeHelper(java.lang.String gP, java.lang.String p, Tree t) |
void | visitTree(Tree t) - Does whatever one needs to do to a particular parse tree. |
public static final double[] CUTOFFS
public static final double SUPPCUTOFF
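The CUTOFFS field is described as a minimum "support * KL" score. The sketch below shows one plausible way such a score could be computed; it is not taken from the class itself. It assumes that support is the number of times the parent-annotated category was observed, and that KL is the Kullback-Leibler divergence (in nats) from the annotated category's expansion distribution to the unannotated one. The class name `SupportKlSketch`, its methods, the example counts, and the cutoff value are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of ParentAnnotationStats. */
final class SupportKlSketch {

    /** Turns raw expansion counts into a probability distribution. */
    static Map<String, Double> normalize(Map<String, Integer> counts) {
        double total = 0.0;
        for (int c : counts.values()) {
            total += c;
        }
        Map<String, Double> dist = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            dist.put(e.getKey(), e.getValue() / total);
        }
        return dist;
    }

    /**
     * KL(p || q) in nats. Assumes every expansion with p > 0 also
     * appears in q with q > 0 (true when q is the pooled distribution).
     */
    static double klDivergence(Map<String, Double> p, Map<String, Double> q) {
        double kl = 0.0;
        for (Map.Entry<String, Double> e : p.entrySet()) {
            double px = e.getValue();
            if (px > 0.0) {
                kl += px * Math.log(px / q.get(e.getKey()));
            }
        }
        return kl;
    }

    /** Assumed score: (count of annotated category) * KL(annotated || base). */
    static double supportTimesKl(Map<String, Integer> annotated,
                                 Map<String, Integer> base) {
        int support = 0;
        for (int c : annotated.values()) {
            support += c;
        }
        return support * klDivergence(normalize(annotated), normalize(base));
    }

    public static void main(String[] args) {
        // Made-up counts: expansions of NP under S versus NP anywhere.
        Map<String, Integer> npUnderS = new LinkedHashMap<>();
        npUnderS.put("PRP", 4);
        Map<String, Integer> npOverall = new LinkedHashMap<>();
        npOverall.put("PRP", 4);
        npOverall.put("DT NN", 4);

        double score = supportTimesKl(npUnderS, npOverall);
        double cutoff = 2.0; // hypothetical threshold, not a value from CUTOFFS
        System.out.println("score=" + score + " split=" + (score >= cutoff));
    }
}
```

Multiplying by support keeps rare contexts from qualifying on a large divergence alone, which would be consistent with the separate SUPPCUTOFF threshold for grandparent contexts.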
public void visitTree(Tree t)
Specified by: visitTree in interface TreeVisitor
Parameters: t - A tree. Classes implementing this interface can assume that the tree passed in is not null.
public static java.util.List<java.lang.String> kidLabels(Tree t)
public void processTreeHelper(java.lang.String gP, java.lang.String p, Tree t)
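The signatures of kidLabels(Tree t) and processTreeHelper(gP, p, t) suggest a recursive walk that carries the grandparent and parent labels down the tree. The following is a minimal sketch of that general pattern, not the class's actual implementation. It uses a stand-in `Node` type instead of the library's Tree, and the "TOP" placeholder for missing ancestors and the `label^parent^grandparent -> kids` key format are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of ParentAnnotationStats. */
final class ParentContextSketch {

    /** Minimal stand-in for a parse tree node (not the library's Tree). */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    /** Counts of "label^parent^grandparent -> child labels" (assumed format). */
    final Map<String, Integer> counts = new LinkedHashMap<>();

    /** Returns the labels of a node's immediate children, in order. */
    static List<String> kidLabels(Node t) {
        List<String> labels = new ArrayList<>();
        for (Node kid : t.children) {
            labels.add(kid.label);
        }
        return labels;
    }

    /** Entry point: the root has no ancestors, so use a placeholder label. */
    void visit(Node root) {
        helper("TOP", "TOP", root);
    }

    /** Records the expansion of t in its context, then recurses into the children. */
    private void helper(String gP, String p, Node t) {
        if (t.isLeaf()) {
            return; // words have no expansion to record
        }
        String key = t.label + "^" + p + "^" + gP
                + " -> " + String.join(" ", kidLabels(t));
        counts.merge(key, 1, Integer::sum);
        for (Node kid : t.children) {
            // The current parent becomes the grandparent one level down.
            helper(p, t.label, kid);
        }
    }

    public static void main(String[] args) {
        Node tree = new Node("S",
                new Node("NP", new Node("DT", new Node("the")),
                               new Node("NN", new Node("dog"))),
                new Node("VP", new Node("VBZ", new Node("barks"))));
        ParentContextSketch stats = new ParentContextSketch();
        stats.visit(tree);
        stats.counts.forEach((k, v) -> System.out.println(k + "\t" + v));
    }
}
```

Counts gathered this way could then be compared against the counts for the unannotated category to decide which contexts are worth splitting.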
public void printStats()
public static void main(java.lang.String[] args)
Usage: java edu.stanford.nlp.parser.lexparser.ParentAnnotationStats [-tags] treebankPath
Parameters: args - The path to the Treebank, optionally preceded by -tags as in the usage line above.
public static java.util.Set<java.lang.String> getSplitCategories(Treebank t, double cutOff, TreebankLanguagePack tlp)
If tlp is non-null, tlp.basicCategory() will be called on parent and grandparent nodes.
This version just defaults some parameters. Implementation note: This method is not designed for concurrent invocation because it uses static state variables.
public static java.util.Set<java.lang.String> getSplitCategories(Treebank t, boolean doTags, int algorithm, double phrasalCutOff, double tagCutOff, TreebankLanguagePack tlp)
If tlp is non-null, tlp.basicCategory() will be called on parent and grandparent nodes.
Implementation note: This method is not designed for concurrent invocation because it uses static state variables.
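Because both getSplitCategories overloads rely on static state, callers in a multithreaded program may want to serialize access to them. One possible approach, sketched below, is to route every call through a single lock. The `SerializedCalls` class and its `callSerialized` method are hypothetical helpers, not part of the library, and the lambda in `main` is only a stand-in for the real call.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical helper; not part of the library. */
final class SerializedCalls {

    // One process-wide lock shared by every guarded call.
    private static final ReentrantLock LOCK = new ReentrantLock();

    private SerializedCalls() {
    }

    /** Runs the supplier while holding the lock, so guarded calls never overlap. */
    static <T> T callSerialized(Supplier<T> call) {
        LOCK.lock();
        try {
            return call.get();
        } finally {
            LOCK.unlock(); // always released, even if the call throws
        }
    }

    public static void main(String[] args) {
        // Stand-in for a call that is not safe to run concurrently.
        String result = callSerialized(() -> "stand-in result");
        System.out.println(result);
    }
}
```

With the library on the classpath, a call might look like `SerializedCalls.callSerialized(() -> ParentAnnotationStats.getSplitCategories(t, cutOff, tlp))`. This only protects callers that go through the wrapper; any code that invokes the static methods directly would bypass the lock.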
public static java.util.Set<java.lang.String> getEnglishSplitCategories(java.lang.String treebankRoot)