ConcatVector (Stanford JavaNLP API)

java.lang.Object
- edu.stanford.nlp.loglinear.model.ConcatVector

```
public class ConcatVector
extends java.lang.Object
```
Created on 12/7/14.

Author:

keenon
Implements a concat vector using an array of arrays, with all its attending resizing efficiencies, and double-pointer inefficiencies. Benchmarking from MinimalML (where I adapted this design from) shows that this is the most efficient of several strategies that can be used to implement this.
What is a ConcatVector? Why do I need it?
In short, you want this for online learning, where you may not know the sizes of all your sparse features at initialization. A concat vector is a vector that behaves like a concatenation of smaller component vectors when you take a dot product. However, it never physically concatenates anything; it takes the dot product of each pair of components and sums the results. That way, if you need to expand a component during online learning, it's no problem. As an auxiliary benefit, you can specify sparse and dense components, which greatly speeds up dot product calculation when you have lots of sparse features.
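
The idea described above can be sketched without the library. The following is a minimal stand-alone illustration, not the ConcatVector implementation: the names `ConcatDotSketch`, `Component`, `get`, and `dot` are hypothetical and exist only for this example. It assumes each component is either a dense array or a single one-hot (index, value) pair, and that anything outside the stored data counts as 0.

```java
// Hypothetical stand-alone sketch of the concat-vector idea; not the library's code.
final class ConcatDotSketch {
    // A component is either dense (an array) or sparse one-hot (index, value).
    static final class Component {
        final double[] dense; // non-null for a dense component
        final int index;      // used only when dense == null
        final double value;   // used only when dense == null

        Component(double[] dense) { this.dense = dense; this.index = -1; this.value = 0.0; }
        Component(int index, double value) { this.dense = null; this.index = index; this.value = value; }

        // Value at an offset, treating everything outside the stored data as 0.
        double get(int offset) {
            if (dense != null) return offset >= 0 && offset < dense.length ? dense[offset] : 0.0;
            return offset == index ? value : 0.0;
        }
    }

    // Dot product of two components; a sparse side needs only one lookup.
    static double dot(Component a, Component b) {
        if (a.dense == null) return a.value * b.get(a.index);
        if (b.dense == null) return b.value * a.get(b.index);
        int n = Math.min(a.dense.length, b.dense.length);
        double sum = 0.0;
        for (int i = 0; i < n; i++) sum += a.dense[i] * b.dense[i];
        return sum;
    }

    // Sum of per-component dot products; nothing is ever concatenated.
    static double dot(Component[] a, Component[] b) {
        int n = Math.min(a.length, b.length);
        double sum = 0.0;
        for (int i = 0; i < n; i++) sum += dot(a[i], b[i]);
        return sum;
    }

    public static void main(String[] args) {
        Component[] weights = {
            new Component(new double[] {1.0, 2.0, 3.0}),
            new Component(new double[] {0.5, 0.5})
        };
        Component[] features = {
            new Component(new double[] {1.0, 1.0}), // shorter than the weights: padded with 0s
            new Component(1, 4.0)                   // one-hot at offset 1
        };
        System.out.println(dot(weights, features)); // prints 5.0
    }
}
```

Because each component is stored separately, growing one component in this sketch would only mean replacing one array; the other components are untouched.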

Constructor Summary

Constructors
Constructor and Description

ConcatVector(int numComponents)
Constructor that initializes space for this concat vector.

Method Summary

Modifier and Type	Method and Description
`void`	`addVectorInPlace(ConcatVector other, double multiple)` This will add the vector "other" to this vector, scaling other by multiple.
`ConcatVector`	`deepClone()`
`double`	`dotProduct(ConcatVector other)` This function assumes both vectors are infinitely padded with 0s, so it won't complain if there's a dim mismatch.
`void`	`elementwiseProductInPlace(ConcatVector other)` This will multiply this vector elementwise by the vector "other".
`double[]`	`getDenseComponent(int i)` This function will throw an assert if the component you're requesting isn't dense.
`int`	`getNumberOfComponents()`
`ConcatVectorProto.ConcatVector.Builder`	`getProtoBuilder()`
`int`	`getSparseIndex(int component)` Gets you the index of one hot in a component, assuming it is sparse.
`double`	`getValueAt(int component, int offset)` This assumes infinite padding with 0s.
`boolean`	`isComponentSparse(int i)`
`void`	`mapInPlace(java.util.function.DoubleUnaryOperator fn)` Apply a function to every element of every component of this vector, and replace with the result.
`ConcatVector`	`newEmptyClone()` Creates a ConcatVector whose dimensions are the same as this one for all dense components, but is otherwise completely empty.
`static ConcatVector`	`readFromProto(ConcatVectorProto.ConcatVector m)` Recreates an in-memory concat vector object from a Proto serialization.
`static ConcatVector`	`readFromStream(java.io.InputStream stream)` Static function to deserialize a concat vector from an input stream.
`void`	`setDenseComponent(int component, double[] values)` Sets a single component of the concat vector value as a dense vector.
`void`	`setSparseComponent(int component, int index, double value)` Sets a single component of the concat vector value as a sparse, one hot value.
`java.lang.String`	`toString()`
`boolean`	`valueEquals(ConcatVector other, double tolerance)` Compares two concat vectors by value.
`void`	`writeToStream(java.io.OutputStream stream)` Writes the protobuf version of this vector to a stream.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - ConcatVector
```
public ConcatVector(int numComponents)
```
    Constructor that initializes space for this concat vector. Don't worry, it can resize individual elements as necessary but it's most efficient if you get this right at construction.
    
    Parameters:
    
    numComponents - The number of components (usually number of features) to allocate for.
- Method Detail
  - newEmptyClone
```
public ConcatVector newEmptyClone()
```
    Creates a ConcatVector whose dimensions are the same as this one for all dense components, but is otherwise completely empty. This is useful to prevent resizing during optimizations where we're adding lots of sparse vectors.
    
    Returns:
    
    an empty vector suitable for use as a gradient
  - setDenseComponent
```
public void setDenseComponent(int component,
                              double[] values)
```
    Sets a single component of the concat vector value as a dense vector. This will make a copy of your values array, so you're free to continue mutating it.
    
    Parameters:
    
    component - the index of the component to set
    
    values - the array of dense values to put into the component
  - setSparseComponent
```
public void setSparseComponent(int component,
                               int index,
                               double value)
```
    Sets a single component of the concat vector value as a sparse, one hot value.
    
    Parameters:
    
    component - the index of the component to set
    
    index - the index within the component that holds the one-hot value
    
    value - the value at that index
  - dotProduct
```
public double dotProduct(ConcatVector other)
```
    This function assumes both vectors are infinitely padded with 0s, so it won't complain if there's a dim mismatch. There are no side effects.
    
    Parameters:
    
    other - the vector to take the dot product with
    
    Returns:
    
    the dot product of this and other
  - deepClone
```
public ConcatVector deepClone()
```
    Returns:
    
    a clone of this concat vector, with deep copies of data structures
  - addVectorInPlace
```
public void addVectorInPlace(ConcatVector other,
                             double multiple)
```
    This will add the vector "other" to this vector, scaling other by multiple. In algebra,
    this = this + (other * multiple)
    The function assumes that both vectors are padded infinitely with 0s, so it will grow this vector by adding components and changing component sizes (dense to bigger dense) and shapes (sparse to dense) in order to accommodate the result.
    
    Parameters:
    
    other - the vector to add to this one
    
    multiple - the multiple to use
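    
    As a rough illustration of the "dense to bigger dense" case, here is a stand-alone sketch on plain arrays. It is not the library's code; `PaddedAddSketch` and `addScaled` are hypothetical names, and the sketch covers only a single dense component.

```java
import java.util.Arrays;

// Hypothetical sketch of this = this + (other * multiple) for one dense component.
final class PaddedAddSketch {
    // Adds other * multiple into target. If other is longer, a zero-padded copy
    // of target is allocated; otherwise target is modified and returned as-is.
    static double[] addScaled(double[] target, double[] other, double multiple) {
        double[] result = target.length >= other.length
                ? target
                : Arrays.copyOf(target, other.length); // new slots start at 0.0
        for (int i = 0; i < other.length; i++) {
            result[i] += other[i] * multiple;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] grown = addScaled(new double[] {1.0, 2.0}, new double[] {1.0, 1.0, 1.0}, 2.0);
        System.out.println(Arrays.toString(grown)); // prints [3.0, 4.0, 2.0]
    }
}
```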
  - elementwiseProductInPlace
```
public void elementwiseProductInPlace(ConcatVector other)
```
    This will multiply this vector elementwise by the vector "other". It's the equivalent of the MATLAB
    this = this .* other
    The function assumes that both vectors are padded infinitely with 0s, so will result in lots of 0s in this vector if it is longer than 'other'.
    
    Parameters:
    
    other - the vector to multiply into this one
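    
    The zero-padding behavior can be seen in a small stand-alone sketch on plain arrays. This is an illustration only, with the hypothetical names `PaddedProductSketch` and `multiplyInPlace`; it handles one dense component and is not the library's code.

```java
import java.util.Arrays;

// Hypothetical sketch of this = this .* other for one dense component.
final class PaddedProductSketch {
    // Positions of target beyond the end of other are multiplied by the implicit 0.
    static void multiplyInPlace(double[] target, double[] other) {
        for (int i = 0; i < target.length; i++) {
            target[i] *= i < other.length ? other[i] : 0.0;
        }
    }

    public static void main(String[] args) {
        double[] v = {2.0, 3.0, 4.0};
        multiplyInPlace(v, new double[] {5.0, 6.0});
        System.out.println(Arrays.toString(v)); // prints [10.0, 18.0, 0.0]
    }
}
```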
  - mapInPlace
```
public void mapInPlace(java.util.function.DoubleUnaryOperator fn)
```
    Apply a function to every element of every component of this vector, and replace with the result.
    
    Parameters:
    
    fn - the function to apply to every element of every component.
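    
    For readers unfamiliar with `java.util.function.DoubleUnaryOperator`, the following stand-alone sketch applies one to every stored element of an array of arrays. The class `MapInPlaceSketch` is hypothetical, and the sketch only visits stored values; whether and how the library treats sparse components or implicit padding is not shown here.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch: apply a function to every stored element of every component.
final class MapInPlaceSketch {
    static void mapInPlace(double[][] components, DoubleUnaryOperator fn) {
        for (double[] component : components) {
            for (int i = 0; i < component.length; i++) {
                component[i] = fn.applyAsDouble(component[i]);
            }
        }
    }

    public static void main(String[] args) {
        double[][] components = {{1.0, 2.0}, {3.0}};
        mapInPlace(components, x -> x * x); // square every element
        System.out.println(Arrays.deepToString(components)); // prints [[1.0, 4.0], [9.0]]
    }
}
```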
  - getNumberOfComponents
```
public int getNumberOfComponents()
```
    Returns:
    
    the number of concatenated vectors that compose this ConcatVector
  - isComponentSparse
```
public boolean isComponentSparse(int i)
```
    Parameters:
    
    i - the index of the component to check
    
    Returns:
    
    whether component i is sparse or not
  - getDenseComponent
```
public double[] getDenseComponent(int i)
```
    This function will throw an assert if the component you're requesting isn't dense.
    
    Parameters:
    
    i - the index of the component to look at
    
    Returns:
    
    the dense array composing that component
  - getValueAt
```
public double getValueAt(int component,
                         int offset)
```
    This assumes infinite padding with 0s. It returns 0 if the requested position is out of bounds (OOB); use getSegmentSizes() to check first if that's undesirable behavior. Otherwise it returns the stored value.
    
    Parameters:
    
    component - the index of the component to retrieve a value from
    
    offset - the offset within that component
    
    Returns:
    
    the value retrieved, or 0 if OOB
  - getSparseIndex
```
public int getSparseIndex(int component)
```
    Gets you the index of one hot in a component, assuming it is sparse. Throws an assert if it isn't.
    
    Parameters:
    
    component - the index of the sparse component.
    
    Returns:
    
    the index of the one-hot value within that sparse component.
  - writeToStream
```
public void writeToStream(java.io.OutputStream stream)
                   throws java.io.IOException
```
    Writes the protobuf version of this vector to a stream. Reversible with readFromStream().
    
    Parameters:
    
    stream - the output stream to write to
    
    Throws:
    
    java.io.IOException - passed through from the stream
  - readFromStream
```
public static ConcatVector readFromStream(java.io.InputStream stream)
                                   throws java.io.IOException
```
    Static function to deserialize a concat vector from an input stream.
    
    Parameters:
    
    stream - the stream to read from, assuming protobuf encoding
    
    Returns:
    
    a new concat vector
    
    Throws:
    
    java.io.IOException - passed through from the stream
  - getProtoBuilder
```
public ConcatVectorProto.ConcatVector.Builder getProtoBuilder()
```
    Returns:
    
    a Builder for proto serialization
  - readFromProto
```
public static ConcatVector readFromProto(ConcatVectorProto.ConcatVector m)
```
    Recreates an in-memory concat vector object from a Proto serialization.
    
    Parameters:
    
    m - the concat vector proto
    
    Returns:
    
    an in-memory concat vector object
  - valueEquals
```
public boolean valueEquals(ConcatVector other,
                           double tolerance)
```
    Compares two concat vectors by value. This means that we're 0 padding, so a dense and sparse component might both be considered the same, if the dense array reflects the same value as the sparse array. This is pretty much only useful for testing. Since it's primarily for testing, we went with the slower, more obviously correct design.
    
    Parameters:
    
    other - the vector we're comparing to
    
    tolerance - the amount any pair of values can differ before we say the two vectors are different.
    
    Returns:
    
    whether the two vectors are the same
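    
    A tolerance comparison under zero padding might look like the sketch below. It is a stand-alone illustration on plain arrays, not the library's code; `ValueEqualsSketch` is a hypothetical name, and treating a difference exactly equal to the tolerance as "same" is an assumption of this sketch.

```java
// Hypothetical sketch: compare two arrays by value, padding the shorter one with 0s.
final class ValueEqualsSketch {
    static boolean valueEquals(double[] a, double[] b, double tolerance) {
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            double x = i < a.length ? a[i] : 0.0; // implicit 0 past the end of a
            double y = i < b.length ? b[i] : 0.0; // implicit 0 past the end of b
            if (Math.abs(x - y) > tolerance) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Same values once the shorter array is padded with 0s.
        System.out.println(valueEquals(new double[] {1.0, 2.0}, new double[] {1.0, 2.0, 0.0}, 1e-9)); // prints true
        // A difference of about 0.1 exceeds a tolerance of 0.05.
        System.out.println(valueEquals(new double[] {1.0, 2.0}, new double[] {1.0, 2.1}, 0.05)); // prints false
    }
}
```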
  - toString
```
public java.lang.String toString()
```
    Overrides:
    
    toString in class java.lang.Object
