public class GenericDataSetReader
extends java.lang.Object
Modifier and Type | Field and Description |
---|---|
protected boolean | calculateHeadSpan: If true, sets the head span to match the syntactic head of the extent. |
protected boolean | forceGenerationOfIndexSpans: If true, it regenerates the index spans for all tree nodes (useful for KBP). |
protected HeadFinder | headFinder: Finds the syntactic head of a syntactic constituent. |
protected java.util.logging.Logger | logger: A logger for this class. |
protected Annotator | parserProcessor: Additional NL processor that implements only syntactic parsing (needed for head detection). We need this processor to detect heads of predicted entities that cannot be matched to an existing constituent. |
protected boolean | preProcessSentences: If true, we perform syntactic analysis of the dataset sentences and annotations. |
protected StanfordCoreNLP | processor: NL processor to use for sentence pre-processing. |
protected boolean | useNewHeadFinder: Only around for legacy results. |
Constructor and Description |
---|
GenericDataSetReader() |
GenericDataSetReader(StanfordCoreNLP processor, boolean preProcessSentences, boolean calculateHeadSpan, boolean forceGenerationOfIndexSpans) |
Modifier and Type | Method and Description |
---|---|
int | assignSyntacticHead(EntityMention ent, Tree tree, java.util.List<CoreLabel> tokens, boolean setHeadSpan): Finds the index of the head of an entity. |
static void | convertToCoreLabels(Tree tree): Converts the tree labels to CoreLabels. |
Tree | findSyntacticHead(EntityMention ent, Tree root, java.util.List<CoreLabel> tokens): Finds the syntactic head of the given entity mention. |
java.util.logging.Level | getLoggerLevel() |
Annotator | getParser() |
Tree | originalFindSyntacticHead(EntityMention ent, Tree root, java.util.List<CoreLabel> tokens): This is the original version of findSyntacticHead(edu.stanford.nlp.ie.machinereading.structure.EntityMention, edu.stanford.nlp.trees.Tree, java.util.List<edu.stanford.nlp.ling.CoreLabel>) before Chris's modifications. |
protected Tree | parse(java.util.List<CoreLabel> tokens) |
protected Tree | parse(java.util.List<CoreLabel> tokens, java.util.List<ParserConstraint> constraints) |
Annotation | parse(java.lang.String path): Parses one file or directory with data from one domain. |
protected Tree | parseStrings(java.util.List<java.lang.String> tokens) |
void | preProcessSentences(Annotation dataset): Takes a dataset Annotation, generates its parse trees, and identifies syntactic heads (and head spans, if necessary). |
Annotation | read(java.lang.String path) |
void | setLoggerLevel(java.util.logging.Level level) |
void | setProcessor(StanfordCoreNLP p) |
void | setUseNewHeadFinder(boolean useNewHeadFinder) |
protected java.util.logging.Logger logger
protected final HeadFinder headFinder
protected StanfordCoreNLP processor
protected Annotator parserProcessor
protected final boolean preProcessSentences
protected final boolean calculateHeadSpan
protected final boolean forceGenerationOfIndexSpans
protected boolean useNewHeadFinder
public GenericDataSetReader()
public GenericDataSetReader(StanfordCoreNLP processor, boolean preProcessSentences, boolean calculateHeadSpan, boolean forceGenerationOfIndexSpans)
public void setProcessor(StanfordCoreNLP p)
public void setUseNewHeadFinder(boolean useNewHeadFinder)
public Annotator getParser()
public void setLoggerLevel(java.util.logging.Level level)
public java.util.logging.Level getLoggerLevel()
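
The two logger accessors take and return a standard `java.util.logging.Level`. The sketch below is a minimal illustration using only the JDK; `LoggingReaderSketch` is a hypothetical stand-in for the reader, and the assumption that the accessors simply forward to the `logger` field is not confirmed by this page.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the reader: only the logging accessors are modeled.
class LoggingReaderSketch {
  protected Logger logger = Logger.getLogger(LoggingReaderSketch.class.getName());

  // Assumption: the setter forwards the level to the logger field.
  public void setLoggerLevel(Level level) {
    logger.setLevel(level);
  }

  // Assumption: the getter reports the level currently set on the logger field.
  public Level getLoggerLevel() {
    return logger.getLevel();
  }
}

class LoggingReaderDemo {
  public static void main(String[] args) {
    LoggingReaderSketch reader = new LoggingReaderSketch();
    reader.setLoggerLevel(Level.SEVERE);
    reader.logger.info("dropped: INFO is below SEVERE");
    reader.logger.severe("emitted: SEVERE passes the threshold");
    System.out.println(reader.getLoggerLevel()); // prints "SEVERE"
  }
}
```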
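public final Annotation parse(java.lang.String path) throws java.io.IOException
Parses one file or directory with data from one domain.
Parameters:
path - The file or directory to parse.
Throws:
java.io.IOException
public Annotation read(java.lang.String path) throws java.lang.Exception
Throws:
java.lang.Exception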
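
The signatures above (a `final` `parse` that throws `IOException`, next to an overridable `read` that throws `Exception`) suggest a template-method arrangement. This page does not say how the two relate, so the sketch below is an assumption: `parse` delegates to `read`, wraps any failure in an `IOException`, and then runs pre-processing if the flag is set. `DatasetSketch`, `ReaderSketch`, and `OneLineReaderSketch` are hypothetical stand-ins, not CoreNLP classes.

```java
import java.io.IOException;

// Hypothetical stand-in for Annotation; the real class is far richer.
class DatasetSketch {
  final String source;
  boolean preProcessed = false;

  DatasetSketch(String source) {
    this.source = source;
  }
}

// Hypothetical base class that mirrors only the parse/read signatures.
class ReaderSketch {
  protected final boolean preProcessSentences;

  ReaderSketch(boolean preProcessSentences) {
    this.preProcessSentences = preProcessSentences;
  }

  // Fixed entry point: final, so subclasses cannot change the overall flow.
  public final DatasetSketch parse(String path) throws IOException {
    DatasetSketch dataset;
    try {
      dataset = read(path);
    } catch (Exception e) {
      // Assumption: reader failures surface as IOException.
      throw new IOException(e);
    }
    if (preProcessSentences) {
      preProcessSentences(dataset);
    }
    return dataset;
  }

  // Format-specific hook that subclasses are expected to override.
  public DatasetSketch read(String path) throws Exception {
    throw new UnsupportedOperationException("no format-specific reader supplied");
  }

  // Placeholder for parsing and head detection; here it only sets a flag.
  public void preProcessSentences(DatasetSketch dataset) {
    dataset.preProcessed = true;
  }
}

// Hypothetical subclass supplying the format-specific part.
class OneLineReaderSketch extends ReaderSketch {
  OneLineReaderSketch() {
    super(true);
  }

  @Override
  public DatasetSketch read(String path) {
    return new DatasetSketch(path);
  }
}

class ParseReadDemo {
  public static void main(String[] args) throws IOException {
    DatasetSketch dataset = new OneLineReaderSketch().parse("corpus/dir");
    System.out.println(dataset.source + " " + dataset.preProcessed); // prints "corpus/dir true"
  }
}
```

Under this assumption, a format-specific reader overrides only `read` and inherits the pre-processing step from the base class.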
public int assignSyntacticHead(EntityMention ent, Tree tree, java.util.List<CoreLabel> tokens, boolean setHeadSpan)
Parameters:
ent - The entity mention.
tree - The Tree for the entire sentence in which it occurs.
tokens - The Sentence in which it occurs.
setHeadSpan - Whether to set the head span in the entity mention.
public void preProcessSentences(Annotation dataset)
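
To make the idea of a syntactic head concrete, here is a simplified illustration of one common approach: find the constituent whose token span equals the mention's extent, then follow head children down to a single token. It is not the class's actual algorithm. `NodeSketch` and `HeadSketch` are hypothetical, head children are hard-coded rather than chosen by a `HeadFinder`, and the fallback for an unmatched extent (last token of the extent) is an assumption; per the field descriptions, the real class relies on `parserProcessor` for that case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical constituent covering tokens [start, end) with one child marked as head.
class NodeSketch {
  final String label;
  final int start;
  final int end;
  final List<NodeSketch> children = new ArrayList<>();
  int headChild = 0; // index into children; unused for leaves

  NodeSketch(String label, int start, int end) {
    this.label = label;
    this.start = start;
    this.end = end;
  }

  NodeSketch add(int headChild, NodeSketch... kids) {
    for (NodeSketch kid : kids) {
      children.add(kid);
    }
    this.headChild = headChild;
    return this;
  }

  boolean isLeaf() {
    return children.isEmpty();
  }
}

class HeadSketch {
  // Returns the highest node whose span is exactly [start, end), or null if none matches.
  static NodeSketch findExact(NodeSketch node, int start, int end) {
    if (node.start == start && node.end == end) {
      return node;
    }
    for (NodeSketch child : node.children) {
      if (child.start <= start && end <= child.end) {
        return findExact(child, start, end);
      }
    }
    return null;
  }

  // Follows head children down to a leaf and returns that token's index.
  static int headTokenIndex(NodeSketch node) {
    while (!node.isLeaf()) {
      node = node.children.get(node.headChild);
    }
    return node.start;
  }

  // Head index for the extent [extentStart, extentEnd).
  static int assignHead(NodeSketch root, int extentStart, int extentEnd) {
    NodeSketch match = findExact(root, extentStart, extentEnd);
    if (match == null) {
      return extentEnd - 1; // assumption: fall back to the extent's last token
    }
    return headTokenIndex(match);
  }

  // "the red car stopped": S -> NP(the red car) stopped; NP is headed by "car".
  static NodeSketch exampleTree() {
    NodeSketch np = new NodeSketch("NP", 0, 3).add(2,
        new NodeSketch("the", 0, 1),
        new NodeSketch("red", 1, 2),
        new NodeSketch("car", 2, 3));
    return new NodeSketch("S", 0, 4).add(1, np, new NodeSketch("stopped", 3, 4));
  }

  public static void main(String[] args) {
    NodeSketch root = exampleTree();
    System.out.println(assignHead(root, 0, 3)); // prints 2 ("car" heads the NP)
    System.out.println(assignHead(root, 0, 2)); // prints 1 (no matching constituent, fallback)
  }
}
```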
public static void convertToCoreLabels(Tree tree)
Converts the tree labels to CoreLabels.
Parameters:
tree - The tree whose labels are converted.
public Tree findSyntacticHead(EntityMention ent, Tree root, java.util.List<CoreLabel> tokens)
Parameters:
ent - The entity mention.
root - The Tree for the entire sentence in which it occurs.
tokens - The Sentence in which it occurs.
public Tree originalFindSyntacticHead(EntityMention ent, Tree root, java.util.List<CoreLabel> tokens)
This is the original version of findSyntacticHead(edu.stanford.nlp.ie.machinereading.structure.EntityMention, edu.stanford.nlp.trees.Tree, java.util.List<edu.stanford.nlp.ling.CoreLabel>) before Chris's modifications. There's no good reason to use it except for producing historical results. It finds the syntactic head of the given entity mention.
Parameters:
ent - The entity mention.
root - The Tree for the entire sentence in which it occurs.
tokens - The Sentence in which it occurs.
protected Tree parseStrings(java.util.List<java.lang.String> tokens)
protected Tree parse(java.util.List<CoreLabel> tokens, java.util.List<ParserConstraint> constraints)