WordsToSentencesAnnotator (Stanford JavaNLP API)

java.lang.Object
- edu.stanford.nlp.pipeline.WordsToSentencesAnnotator

All Implemented Interfaces:

Annotator
```
public class WordsToSentencesAnnotator
extends java.lang.Object
implements Annotator
```
This class assumes that there is a List<CoreLabel> under the TokensAnnotation field. It runs that list through WordToSentenceProcessor and puts the resulting List<Annotation> under the SentencesAnnotation field.
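
The token-to-sentence grouping described above can be illustrated without the library. The following is a minimal sketch, not the WordToSentenceProcessor implementation: it assumes plain String tokens instead of CoreLabel, and the class name `SentenceSplitSketch`, the method `split`, and the boundary regex are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SentenceSplitSketch {

    // Hypothetical helper: groups a flat token list into sentences,
    // closing a sentence after every token that fully matches boundaryRegex.
    public static List<List<String>> split(List<String> tokens, String boundaryRegex) {
        Pattern boundary = Pattern.compile(boundaryRegex);
        List<List<String>> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String token : tokens) {
            current.add(token);
            if (boundary.matcher(token).matches()) {
                sentences.add(current);
                current = new ArrayList<>();
            }
        }
        // Keep trailing tokens that were not followed by a boundary token.
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Hello", "world", ".", "How", "are", "you", "?");
        System.out.println(split(tokens, "[.!?]"));
    }
}
```

In the real annotator the input and output live in annotation fields rather than in method arguments, so this sketch only mirrors the grouping step.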

Author:

Jenny Finkel, Christopher Manning

Constructor Summary

Constructors
Constructor and Description
`WordsToSentencesAnnotator()`
`WordsToSentencesAnnotator(boolean verbose)`
`WordsToSentencesAnnotator(boolean verbose, java.lang.String boundaryTokenRegex, java.util.Set<java.lang.String> boundaryToDiscard, java.util.Set<java.lang.String> htmlElementsToDiscard, java.lang.String newlineIsSentenceBreak, java.lang.String boundaryMultiTokenRegex, java.util.Set<java.lang.String> tokenRegexesToDiscard)`
`WordsToSentencesAnnotator(java.util.Properties properties)`

Method Summary

Modifier and Type	Method and Description
`void`	`annotate(Annotation annotation)` If setCountLineNumbers is set to true, we count line numbers by telling the underlying splitter to return empty lists of tokens and then treating those empty lists as empty lines.
`static WordsToSentencesAnnotator`	`newlineSplitter(java.lang.String... nlToken)` Return a WordsToSentencesAnnotator that splits on newlines (only), which are then deleted.
`static WordsToSentencesAnnotator`	`nonSplitter()` Return a WordsToSentencesAnnotator that never splits the token stream.
`java.util.Set<java.lang.Class<? extends CoreAnnotation>>`	`requirementsSatisfied()` Returns a set of requirements for which tasks this annotator can provide.
`java.util.Set<java.lang.Class<? extends CoreAnnotation>>`	`requires()` Returns the set of tasks which this annotator requires in order to perform.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface edu.stanford.nlp.pipeline.Annotator
exactRequirements, unmount

- Constructor Detail
  - WordsToSentencesAnnotator
```
public WordsToSentencesAnnotator()
```
  - WordsToSentencesAnnotator
```
public WordsToSentencesAnnotator(java.util.Properties properties)
```
  - WordsToSentencesAnnotator
```
public WordsToSentencesAnnotator(boolean verbose)
```
  - WordsToSentencesAnnotator
```
public WordsToSentencesAnnotator(boolean verbose,
                                 java.lang.String boundaryTokenRegex,
                                 java.util.Set<java.lang.String> boundaryToDiscard,
                                 java.util.Set<java.lang.String> htmlElementsToDiscard,
                                 java.lang.String newlineIsSentenceBreak,
                                 java.lang.String boundaryMultiTokenRegex,
                                 java.util.Set<java.lang.String> tokenRegexesToDiscard)
```
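
For the Properties-based constructor, a configuration might be assembled as in the sketch below. The `ssplit.`-prefixed key names and the value `two` are assumptions that mirror the parameter names of the seven-argument constructor; they are not confirmed by this page, so check them against the CoreNLP sentence splitter documentation. The class name `SsplitPropertiesSketch` is hypothetical.

```java
import java.util.Properties;

public class SsplitPropertiesSketch {

    // Builds a Properties object of the kind one might pass to
    // WordsToSentencesAnnotator(Properties). The key names and values
    // below are assumptions, not taken from this page.
    public static Properties newlineAwareProps() {
        Properties props = new Properties();
        props.setProperty("ssplit.newlineIsSentenceBreak", "two");
        props.setProperty("ssplit.boundaryTokenRegex", "[.!?]");
        return props;
    }

    public static void main(String[] args) {
        Properties props = newlineAwareProps();
        // With CoreNLP on the classpath, the annotator could then be created as:
        //   new WordsToSentencesAnnotator(props);
        System.out.println(props.getProperty("ssplit.newlineIsSentenceBreak"));
    }
}
```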
- Method Detail
  - newlineSplitter
```
public static WordsToSentencesAnnotator newlineSplitter(java.lang.String... nlToken)
```
    Return a WordsToSentencesAnnotator that splits on newlines (only), which are then deleted. This factory method counts lines by putting in empty token lists for empty lines: it tells the underlying splitter to return empty lists of tokens and then treats those empty lists as empty lines. Empty sentences are not actually included in the annotation, but they are used in numbering the sentences. Only this factory method leads to empty sentences.
    
    Parameters:
    
    nlToken - Zero or more newline tokens, each of which might be a \n or a fake newline token returned from the tokenizer.
    
    Returns:
    
    A WordsToSentencesAnnotator.
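
    The line-counting behavior of newlineSplitter can be sketched in plain Java. This is an illustration under stated assumptions, not the library's code: tokens are plain strings, a single newline token is used, line numbers are 1-based, and the names `NewlineSplitSketch`, `NumberedSentence`, and `split` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NewlineSplitSketch {

    // Hypothetical value type: a sentence plus the line it came from.
    public static final class NumberedSentence {
        public final int lineNumber;
        public final List<String> tokens;

        NumberedSentence(int lineNumber, List<String> tokens) {
            this.lineNumber = lineNumber;
            this.tokens = tokens;
        }

        @Override
        public String toString() {
            return lineNumber + ":" + tokens;
        }
    }

    // Splits only on nlToken and drops the newline tokens themselves.
    // Empty lines produce no sentence, but they still advance the line
    // counter, so later sentences keep their original line numbers.
    public static List<NumberedSentence> split(List<String> tokens, String nlToken) {
        List<NumberedSentence> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int line = 1; // assumption: 1-based numbering
        for (String token : tokens) {
            if (token.equals(nlToken)) {
                if (!current.isEmpty()) {
                    sentences.add(new NumberedSentence(line, current));
                    current = new ArrayList<>();
                }
                line++;
            } else {
                current.add(token);
            }
        }
        if (!current.isEmpty()) {
            sentences.add(new NumberedSentence(line, current));
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "b", "\n", "\n", "c");
        System.out.println(split(tokens, "\n"));
    }
}
```

    In this sketch the blank second line yields no sentence of its own, yet the sentence containing "c" is still numbered as line 3.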
  - nonSplitter
```
public static WordsToSentencesAnnotator nonSplitter()
```
    Return a WordsToSentencesAnnotator that never splits the token stream. You just get one sentence.
    
    Returns:
    
    A WordsToSentencesAnnotator.
  - annotate
```
public void annotate(Annotation annotation)
```
    If setCountLineNumbers is set to true, we count line numbers by telling the underlying splitter to return empty lists of tokens and then treating those empty lists as empty lines. We don't actually include empty sentences in the annotation, though.
    
    Specified by:
    
    annotate in interface Annotator
  - requires
```
public java.util.Set<java.lang.Class<? extends CoreAnnotation>> requires()
```
    Description copied from interface: Annotator
    
    Returns the set of tasks which this annotator requires in order to perform. For example, the POS annotator will return "tokenize", "ssplit".
    
    Specified by:
    
    requires in interface Annotator
  - requirementsSatisfied
```
public java.util.Set<java.lang.Class<? extends CoreAnnotation>> requirementsSatisfied()
```
    Description copied from interface: Annotator
    
    Returns a set of requirements for which tasks this annotator can provide. For example, the POS annotator will return "pos".
    
    Specified by:
    
    requirementsSatisfied in interface Annotator
