GenericDataSetReader (Stanford JavaNLP API)

java.lang.Object
- edu.stanford.nlp.ie.machinereading.GenericDataSetReader

Direct Known Subclasses:

AceReader, RothCONLL04Reader
```
public class GenericDataSetReader
extends java.lang.Object
```
Author:

Andrey Gusev, Mihai

Field Summary

Fields
Modifier and Type	Field and Description
`protected boolean`	`calculateHeadSpan` If true, sets the head span to match the syntactic head of the extent.
`protected boolean`	`forceGenerationOfIndexSpans` If true, it regenerates the index spans for all tree nodes (useful for KBP)
`protected HeadFinder`	`headFinder` Finds the syntactic head of a syntactic constituent
`protected java.util.logging.Logger`	`logger` A logger for this class
`protected Annotator`	`parserProcessor` Additional NL processor that implements only syntactic parsing (needed for head detection). We need this processor to detect heads of predicted entities that cannot be matched to an existing constituent.
`protected boolean`	`preProcessSentences` If true, we perform syntactic analysis of the dataset sentences and annotations
`protected StanfordCoreNLP`	`processor` NL processor to use for sentence pre-processing
`protected boolean`	`useNewHeadFinder` Retained only to reproduce legacy results

Constructor Summary

Constructors
Constructor and Description
`GenericDataSetReader()`
`GenericDataSetReader(StanfordCoreNLP processor, boolean preProcessSentences, boolean calculateHeadSpan, boolean forceGenerationOfIndexSpans)`

Method Summary

Modifier and Type	Method and Description
`int`	`assignSyntacticHead(EntityMention ent, Tree tree, java.util.List<CoreLabel> tokens, boolean setHeadSpan)` Find the index of the head of an entity.
`static void`	`convertToCoreLabels(Tree tree)` Converts the tree labels to CoreLabels.
`Tree`	`findSyntacticHead(EntityMention ent, Tree root, java.util.List<CoreLabel> tokens)` Finds the syntactic head of the given entity mention.
`java.util.logging.Level`	`getLoggerLevel()`
`Annotator`	`getParser()`
`Tree`	`originalFindSyntacticHead(EntityMention ent, Tree root, java.util.List<CoreLabel> tokens)` This is the original version of `findSyntacticHead(edu.stanford.nlp.ie.machinereading.structure.EntityMention, edu.stanford.nlp.trees.Tree, java.util.List<edu.stanford.nlp.ling.CoreLabel>)` before Chris's modifications.
`protected Tree`	`parse(java.util.List<CoreLabel> tokens)`
`protected Tree`	`parse(java.util.List<CoreLabel> tokens, java.util.List<ParserConstraint> constraints)`
`Annotation`	`parse(java.lang.String path)` Parses one file or directory with data from one domain
`protected Tree`	`parseStrings(java.util.List<java.lang.String> tokens)`
`void`	`preProcessSentences(Annotation dataset)` Takes a dataset Annotation, generates parse trees for its sentences, and identifies syntactic heads (and head spans, if necessary)
`Annotation`	`read(java.lang.String path)`
`void`	`setLoggerLevel(java.util.logging.Level level)`
`void`	`setProcessor(StanfordCoreNLP p)`
`void`	`setUseNewHeadFinder(boolean useNewHeadFinder)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - logger
```
protected java.util.logging.Logger logger
```
    A logger for this class
  - headFinder
```
protected final HeadFinder headFinder
```
    Finds the syntactic head of a syntactic constituent
  - processor
```
protected StanfordCoreNLP processor
```
    NL processor to use for sentence pre-processing
  - parserProcessor
```
protected Annotator parserProcessor
```
    Additional NL processor that implements only syntactic parsing (needed for head detection). We need this processor to detect heads of predicted entities that cannot be matched to an existing constituent. It is created on demand, only when necessary.
  - preProcessSentences
```
protected final boolean preProcessSentences
```
    If true, we perform syntactic analysis of the dataset sentences and annotations
  - calculateHeadSpan
```
protected final boolean calculateHeadSpan
```
    If true, sets the head span to match the syntactic head of the extent. Otherwise, the head span is not modified. This is enabled for the NFL domain, where head spans are not given.
  - forceGenerationOfIndexSpans
```
protected final boolean forceGenerationOfIndexSpans
```
    If true, it regenerates the index spans for all tree nodes (useful for KBP)
  - useNewHeadFinder
```
protected boolean useNewHeadFinder
```
    Retained only to reproduce legacy results
- Constructor Detail
  - GenericDataSetReader
```
public GenericDataSetReader()
```
  - GenericDataSetReader
```
public GenericDataSetReader(StanfordCoreNLP processor,
                            boolean preProcessSentences,
                            boolean calculateHeadSpan,
                            boolean forceGenerationOfIndexSpans)
```
- Method Detail
  - setProcessor
```
public void setProcessor(StanfordCoreNLP p)
```
  - setUseNewHeadFinder
```
public void setUseNewHeadFinder(boolean useNewHeadFinder)
```
  - getParser
```
public Annotator getParser()
```
  - setLoggerLevel
```
public void setLoggerLevel(java.util.logging.Level level)
```
  - getLoggerLevel
```
public java.util.logging.Level getLoggerLevel()
```
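    The page gives no description for setLoggerLevel and getLoggerLevel. The sketch below shows one plausible reading, in which both methods simply delegate to the documented `logger` field. The class `LoggerLevelSketch` is a hypothetical stand-in that uses only `java.util.logging`; the delegation is an assumption, not the confirmed GenericDataSetReader implementation.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in class; NOT part of Stanford JavaNLP.
class LoggerLevelSketch {
    // Mirrors the documented "protected java.util.logging.Logger logger" field.
    protected final Logger logger =
            Logger.getLogger(LoggerLevelSketch.class.getName());

    // Assumed behavior: forward the requested level to the underlying logger.
    public void setLoggerLevel(Level level) {
        logger.setLevel(level);
    }

    // Assumed behavior: report the level set on the logger.
    // Logger.getLevel() returns null when no level was set explicitly,
    // in which case the effective level is inherited from the parent logger.
    public Level getLoggerLevel() {
        return logger.getLevel();
    }

    public static void main(String[] args) {
        LoggerLevelSketch reader = new LoggerLevelSketch();
        reader.setLoggerLevel(Level.FINE);
        System.out.println(reader.getLoggerLevel());
    }
}
```

    Under this reading, a caller that wants quieter output during reading would call setLoggerLevel with a stricter level such as `Level.SEVERE` before invoking parse.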
  - parse
```
public final Annotation parse(java.lang.String path)
                       throws java.io.IOException
```
    Parses one file or directory with data from one domain.
    
    Parameters:
    
    path - The file or directory to parse
    
    Throws:
    
    java.io.IOException
  - read
```
public Annotation read(java.lang.String path)
                throws java.lang.Exception
```
    Throws:
    
    java.lang.Exception
  - assignSyntacticHead
```
public int assignSyntacticHead(EntityMention ent,
                               Tree tree,
                               java.util.List<CoreLabel> tokens,
                               boolean setHeadSpan)
```
    Find the index of the head of an entity.
    
    Parameters:
    
    ent - The entity mention
    
    tree - The Tree for the entire sentence in which it occurs.
    
    tokens - The tokens of the sentence in which it occurs
    
    setHeadSpan - Whether to set the head span in the entity mention.
    
    Returns:
    
    The index of the entity head
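    To make the idea of a head index concrete, here is a toy sketch that does not use the Stanford classes. It assigns token index spans to every tree node, finds the smallest constituent that covers the mention extent, and then walks down to a leaf. Everything in it is an assumption made for illustration: the `Node` class, the "rightmost child is the head" rule (the real reader uses a HeadFinder), and the fallback to the last token of the extent when the extent matches no constituent (the real reader is documented to use a separate parser in that case).

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; none of these names come from Stanford JavaNLP.
class HeadIndexSketch {
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        int start; // index of the first covered token, inclusive
        int end;   // index one past the last covered token, exclusive

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }

        boolean isLeaf() { return children.isEmpty(); }
    }

    // Assigns [start, end) token spans to every node; returns the next free index.
    static int indexSpans(Node n, int next) {
        n.start = next;
        if (n.isLeaf()) {
            n.end = next + 1;
            return n.end;
        }
        for (Node c : n.children) next = indexSpans(c, next);
        n.end = next;
        return next;
    }

    // Returns the lowest node whose span covers the extent [start, end).
    static Node smallestCovering(Node n, int start, int end) {
        for (Node c : n.children) {
            if (c.start <= start && end <= c.end) return smallestCovering(c, start, end);
        }
        return n;
    }

    // Returns the token index of the head of the extent [start, end).
    static int headIndex(Node root, int start, int end) {
        Node n = smallestCovering(root, start, end);
        // Assumed toy head rule: always descend into the rightmost child.
        while (!n.isLeaf()) n = n.children.get(n.children.size() - 1);
        // Assumed fallback: if the head falls outside the extent, use its last token.
        return (n.start >= start && n.start < end) ? n.start : end - 1;
    }

    // Builds (S (NP the big cat) (VP sat)) with index spans assigned.
    static Node demoTree() {
        Node root = new Node("S",
                new Node("NP", new Node("the"), new Node("big"), new Node("cat")),
                new Node("VP", new Node("sat")));
        indexSpans(root, 0);
        return root;
    }

    public static void main(String[] args) {
        Node root = demoTree();
        // Extent "the big cat" covers tokens 0..2, so the head is "cat".
        System.out.println(headIndex(root, 0, 3)); // prints 2
    }
}
```

    The extent "the big" (tokens 0 and 1) matches no constituent in this tree, so the sketch falls back to token 1. That is the kind of case the parserProcessor field exists to handle more carefully.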
  - preProcessSentences
```
public void preProcessSentences(Annotation dataset)
```
    Takes a dataset Annotation, generates parse trees for its sentences, and identifies syntactic heads (and head spans, if necessary).
  - convertToCoreLabels
```
public static void convertToCoreLabels(Tree tree)
```
    Converts the tree labels to CoreLabels. We need this because we store additional info in the CoreLabel, like token span.
    
    Parameters:
    
    tree - The tree whose labels are converted
  - findSyntacticHead
```
public Tree findSyntacticHead(EntityMention ent,
                              Tree root,
                              java.util.List<CoreLabel> tokens)
```
    Finds the syntactic head of the given entity mention.
    
    Parameters:
    
    ent - The entity mention
    
    root - The Tree for the entire sentence in which it occurs.
    
    tokens - The tokens of the sentence in which it occurs
    
    Returns:
    
    The tree object corresponding to the head. This MUST be a child of root. It will be a leaf in the parse tree.
  - originalFindSyntacticHead
```
public Tree originalFindSyntacticHead(EntityMention ent,
                                      Tree root,
                                      java.util.List<CoreLabel> tokens)
```
    This is the original version of `findSyntacticHead(edu.stanford.nlp.ie.machinereading.structure.EntityMention, edu.stanford.nlp.trees.Tree, java.util.List<edu.stanford.nlp.ling.CoreLabel>)` before Chris's modifications. There's no good reason to use it except for producing historical results. It finds the syntactic head of the given entity mention.
    
    Parameters:
    
    ent - The entity mention
    
    root - The Tree for the entire sentence in which it occurs.
    
    tokens - The tokens of the sentence in which it occurs
    
    Returns:
    
    The tree object corresponding to the head. This MUST be a child of root. It will be a leaf in the parse tree.
  - parseStrings
```
protected Tree parseStrings(java.util.List<java.lang.String> tokens)
```
  - parse
```
protected Tree parse(java.util.List<CoreLabel> tokens)
```
  - parse
```
protected Tree parse(java.util.List<CoreLabel> tokens,
                     java.util.List<ParserConstraint> constraints)
```
