MentionExtractor (Stanford JavaNLP API)

java.lang.Object
- edu.stanford.nlp.dcoref.MentionExtractor

Direct Known Subclasses:

ACEMentionExtractor, CoNLLMentionExtractor, MUCMentionExtractor
```
public class MentionExtractor
extends java.lang.Object
```
Generic mention extractor from a corpus.

Author:

Jenny Finkel, Mihai Surdeanu, Karthik Raghunathan, Heeyoung Lee, Sudarshan Rangarajan

Field Summary

Fields
Modifier and Type	Field and Description
`protected java.lang.String`	`currentDocumentID`
`protected Dictionaries`	`dictionaries`
`protected int`	`maxID` The maximum mention ID, tracked to prevent duplicate mention ID assignment.
`CorefMentionFinder`	`mentionFinder`
`protected Semantics`	`semantics`
`protected LogisticClassifier<java.lang.String,java.lang.String>`	`singletonPredictor`
`protected StanfordCoreNLP`	`stanfordProcessor`
`static boolean`	`VERBOSE`

Constructor Summary

Constructors
Constructor and Description
`MentionExtractor(Dictionaries dict, Semantics semantics)`

Method Summary

Modifier and Type	Method and Description
`Document`	`arrange(Annotation anno, java.util.List<java.util.List<CoreLabel>> words, java.util.List<Tree> trees, java.util.List<java.util.List<Mention>> unorderedMentions)`
`java.util.List<java.util.List<Mention>>`	`arrange(Annotation anno, java.util.List<java.util.List<CoreLabel>> words, java.util.List<Tree> trees, java.util.List<java.util.List<Mention>> unorderedMentions, boolean doMergeLabels)` Post-processes the extracted mentions.
`Document`	`arrange(Annotation anno, java.util.List<java.util.List<CoreLabel>> words, java.util.List<Tree> trees, java.util.List<java.util.List<Mention>> unorderedMentions, java.util.List<java.util.List<Mention>> unorderedGoldMentions, boolean doMergeLabels)`
`static Tree`	`findExactMatch(Tree tree, int first, int last)` Finds the tree that matches this span exactly.
`protected int`	`getHeadIndex(Tree t)`
`static void`	`initializeUtterance(java.util.List<CoreLabel> tokens)`
`protected static StanfordCoreNLP`	`loadStanfordProcessor(java.util.Properties props)` Loads the Stanford processor, skipping unnecessary annotators.
`static void`	`mergeLabels(Tree tree, java.util.List<CoreLabel> sentence)` Sets the label of the leaf nodes of a Tree to be the CoreLabels in the given sentence.
`Document`	`nextDoc()` Extracts the info relevant for coref from the next document in the corpus
`void`	`resetDocs()` Reset so that we start at the beginning of the document collection
`void`	`setMentionFinder(CorefMentionFinder mentionFinder)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

currentDocumentID

protected java.lang.String currentDocumentID

dictionaries

protected final Dictionaries dictionaries

semantics
```
protected final Semantics semantics
```

mentionFinder

public CorefMentionFinder mentionFinder

stanfordProcessor

protected StanfordCoreNLP stanfordProcessor

singletonPredictor

protected LogisticClassifier<java.lang.String,java.lang.String> singletonPredictor

maxID
```
protected int maxID
```
The maximum mention ID, tracked to prevent duplicate mention ID assignment.

VERBOSE
```
public static final boolean VERBOSE
```
See Also:

Constant Field Values

Constructor Detail

MentionExtractor

public MentionExtractor(Dictionaries dict,
                        Semantics semantics)

Method Detail

setMentionFinder

public void setMentionFinder(CorefMentionFinder mentionFinder)

nextDoc
```
public Document nextDoc()
                 throws java.lang.Exception
```
Extracts the info relevant for coref from the next document in the corpus

Returns:

A Document holding the mentions found in each sentence, ordered according to the tree traversal.

Throws:

java.lang.Exception

resetDocs
```
public void resetDocs()
```
Reset so that we start at the beginning of the document collection
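The nextDoc()/resetDocs() pair behaves like a rewindable cursor over the corpus. The following is a minimal, self-contained sketch of that pattern, not the dcoref implementation: `Cursor` and `countDocs` are hypothetical names, documents are plain strings, and returning `null` at the end of the collection is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;

public class CorpusCursorSketch {
    /** Hypothetical stand-in for an extractor: a cursor over in-memory documents. */
    static final class Cursor {
        private final List<String> docs;
        private int next = 0;

        Cursor(List<String> docs) {
            this.docs = docs;
        }

        /** Returns the next document, or null when exhausted (assumption of this sketch). */
        String nextDoc() {
            return next < docs.size() ? docs.get(next++) : null;
        }

        /** Rewinds so that the next call to nextDoc() starts at the first document. */
        void resetDocs() {
            next = 0;
        }
    }

    /** Drains the cursor and returns how many documents it produced. */
    static int countDocs(Cursor cursor) {
        int n = 0;
        while (cursor.nextDoc() != null) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Cursor cursor = new Cursor(Arrays.asList("doc1", "doc2", "doc3"));
        System.out.println(countDocs(cursor)); // first pass over the collection
        cursor.resetDocs();
        System.out.println(countDocs(cursor)); // second pass after rewinding
    }
}
```

Without the rewind, a second pass would see no documents, which is why a reset is needed before iterating over the same corpus again.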

arrange

public Document arrange(Annotation anno,
                        java.util.List<java.util.List<CoreLabel>> words,
                        java.util.List<Tree> trees,
                        java.util.List<java.util.List<Mention>> unorderedMentions)
                 throws java.lang.Exception

Throws:

java.lang.Exception

getHeadIndex
```
protected int getHeadIndex(Tree t)
```

arrange

public Document arrange(Annotation anno,
                        java.util.List<java.util.List<CoreLabel>> words,
                        java.util.List<Tree> trees,
                        java.util.List<java.util.List<Mention>> unorderedMentions,
                        java.util.List<java.util.List<Mention>> unorderedGoldMentions,
                        boolean doMergeLabels)
                 throws java.lang.Exception

Throws:

java.lang.Exception

arrange
```
public java.util.List<java.util.List<Mention>> arrange(Annotation anno,
                                                       java.util.List<java.util.List<CoreLabel>> words,
                                                       java.util.List<Tree> trees,
                                                       java.util.List<java.util.List<Mention>> unorderedMentions,
                                                       boolean doMergeLabels)
                                                throws java.lang.Exception
```
Post-processes the extracted mentions. Here we set the Mention fields required for coref and order mentions by tree-traversal order.

Parameters:

words - List of words in each sentence, in textual order

trees - List of trees, one per sentence

unorderedMentions - List of unordered, unprocessed mentions. Each mention MUST have startIndex and endIndex set! Optionally, if scoring is desired, mentions must also have mentionID and originalRef set. All the other Mention fields are set here.

Returns:

List of mentions ordered according to the tree traversal

Throws:

java.lang.Exception
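To make the ordering step concrete, here is a small self-contained sketch. It is not the dcoref code: `Span` and `order` are hypothetical names, and the sketch assumes that tree-traversal order corresponds to a pre-order walk, so that a span starting earlier comes first and, among spans with the same start, the enclosing (longer) span precedes the nested one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MentionOrderSketch {
    /** Hypothetical stand-in for a mention: only the span bounds matter here. */
    static final class Span {
        final int startIndex;
        final int endIndex;
        final String text;

        Span(int startIndex, int endIndex, String text) {
            this.startIndex = startIndex;
            this.endIndex = endIndex;
            this.text = text;
        }
    }

    /**
     * Returns a sorted copy: earlier start first; for equal starts, the larger
     * end index (the enclosing span) first. This pre-order rule is an assumption.
     */
    static List<Span> order(List<Span> unordered) {
        List<Span> ordered = new ArrayList<>(unordered);
        ordered.sort(Comparator.comparingInt((Span s) -> s.startIndex)
                .thenComparing(Comparator.comparingInt((Span s) -> s.endIndex).reversed()));
        return ordered;
    }

    public static void main(String[] args) {
        List<Span> unordered = Arrays.asList(
                new Span(4, 5, "Paris"),
                new Span(0, 1, "Obama"),
                new Span(0, 3, "Obama 's wife"));
        for (Span s : order(unordered)) {
            System.out.println(s.startIndex + "-" + s.endIndex + " " + s.text);
        }
    }
}
```

The sketch relies only on the two span bounds, which mirrors the requirement that every incoming mention has startIndex and endIndex set before post-processing.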

mergeLabels
```
public static void mergeLabels(Tree tree,
                               java.util.List<CoreLabel> sentence)
```
Sets the label of the leaf nodes of a Tree to be the CoreLabels in the given sentence. Each leaf keeps its original value(); apart from that, the label of each leaf becomes the corresponding label from the List.
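The merge described above can be pictured with a minimal, self-contained sketch. It uses hypothetical stand-ins (`Token` for a label, `Leaf` for a tree leaf) rather than the real CoreLabel and Tree classes, and it assumes that leaves and sentence tokens line up one to one in textual order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeLabelsSketch {
    /** Hypothetical stand-in for a token label: a value plus arbitrary annotations. */
    static final class Token {
        String value;
        final Map<String, String> annotations = new HashMap<>();

        Token(String value) {
            this.value = value;
        }
    }

    /** Hypothetical stand-in for a tree leaf that carries a label. */
    static final class Leaf {
        Token label;

        Leaf(String value) {
            this.label = new Token(value);
        }
    }

    /**
     * Replaces each leaf's label with the sentence token at the same position,
     * after copying the leaf's original value onto that token.
     */
    static void mergeLabels(List<Leaf> leaves, List<Token> sentence) {
        for (int i = 0; i < leaves.size(); i++) {
            Leaf leaf = leaves.get(i);
            Token token = sentence.get(i);
            token.value = leaf.label.value; // preserve the value stored in the tree
            leaf.label = token;             // everything else now comes from the sentence
        }
    }

    public static void main(String[] args) {
        List<Leaf> leaves = Arrays.asList(new Leaf("-LRB-"), new Leaf("Rome"));
        Token paren = new Token("(");
        Token rome = new Token("Rome");
        rome.annotations.put("ner", "LOCATION");

        mergeLabels(leaves, Arrays.asList(paren, rome));
        System.out.println(leaves.get(0).label.value);
        System.out.println(leaves.get(1).label.annotations.get("ner"));
    }
}
```

After the call, each leaf shares its label object with the sentence, so annotations attached to the tokens become visible from the tree while the tree's own leaf values are kept.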

findExactMatch
```
public static Tree findExactMatch(Tree tree,
                                  int first,
                                  int last)
```
Finds the tree that matches this span exactly.

Parameters:

tree - Leaves must be indexed!

first - First element in the span (first position has offset 1)

last - Last element included in the span (first position has offset 1)
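The 1-based, inclusive span convention is easy to get wrong, so a small self-contained sketch may help. It is an illustration rather than the library's code: `Node`, `indexLeaves`, `firstLeaf`, and `lastLeaf` are hypothetical helpers standing in for an indexed Tree, and the sketch assumes that when several nested nodes cover the same span, the highest one is returned.

```java
import java.util.ArrayList;
import java.util.List;

public class ExactSpanSketch {
    /** Hypothetical minimal tree node; not the Stanford Tree class. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        int leafIndex = 0; // 1-based position, meaningful only on leaves

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) {
                children.add(k);
            }
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    /** Numbers the leaves left to right starting at 'next'; returns the next free index. */
    static int indexLeaves(Node n, int next) {
        if (n.isLeaf()) {
            n.leafIndex = next;
            return next + 1;
        }
        for (Node c : n.children) {
            next = indexLeaves(c, next);
        }
        return next;
    }

    static int firstLeaf(Node n) {
        while (!n.isLeaf()) n = n.children.get(0);
        return n.leafIndex;
    }

    static int lastLeaf(Node n) {
        while (!n.isLeaf()) n = n.children.get(n.children.size() - 1);
        return n.leafIndex;
    }

    /** Returns the highest node whose leaves cover exactly [first, last], or null if none does. */
    static Node findExactMatch(Node n, int first, int last) {
        int lo = firstLeaf(n);
        int hi = lastLeaf(n);
        if (lo == first && hi == last) return n;
        if (first < lo || last > hi) return null; // span is not contained in this subtree
        for (Node c : n.children) {
            Node match = findExactMatch(c, first, last);
            if (match != null) return match;
        }
        return null;
    }

    public static void main(String[] args) {
        Node np = new Node("NP", new Node("DT", new Node("the")), new Node("NN", new Node("cat")));
        Node vp = new Node("VP", new Node("VBZ", new Node("sleeps")));
        Node s = new Node("S", np, vp);
        indexLeaves(s, 1); // leaves must be indexed before searching

        System.out.println(findExactMatch(s, 1, 2).label); // "the cat" is covered by one constituent
        System.out.println(findExactMatch(s, 2, 3));       // "cat sleeps" is not a constituent
    }
}
```

In this toy tree, the span (1, 2) resolves to the NP node, while (2, 3) crosses a constituent boundary and yields no match.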

loadStanfordProcessor

protected static StanfordCoreNLP loadStanfordProcessor(java.util.Properties props)

Loads the Stanford processor, skipping unnecessary annotators.

initializeUtterance

public static void initializeUtterance(java.util.List<CoreLabel> tokens)
