edu.stanford.nlp.classify
Class LinearClassifier

java.lang.Object
  extended by edu.stanford.nlp.classify.LinearClassifier
All Implemented Interfaces:
Classifier, ProbabilisticClassifier, RVFClassifier, Serializable

public class LinearClassifier
extends Object
implements ProbabilisticClassifier, RVFClassifier

Implements a multiclass linear classifier. At classification time, this can represent any generalized linear model classifier (such as a perceptron, naive logistic regression, or an SVM).

Author:
Dan Klein, Jenny Finkel, Galen Andrew (converted to arrays and indices), Christopher Manning (most of the printing options), Eric Yeh (save to text file, new constructor w/thresholds)
See Also:
Serialized Form

Field Summary
 boolean intern
           
static String TEXT_SERIALIZATION_DELIMITER
           
 
Constructor Summary
LinearClassifier(Counter<Pair> weightCounter)
           
LinearClassifier(Counter<Pair> weightCounter, Counter thresholdsC)
           
LinearClassifier(double[][] weights, Index featureIndex, Index labelIndex)
           
LinearClassifier(double[][] weights, Index featureIndex, Index labelIndex, double[] thresholds)
           
 
Method Summary
 void adaptWeights(Dataset adapt, LinearClassifierFactory lcf)
           
 Object classOf(Datum example)
           
 Object classOf(RVFDatum example)
           
 void dump()
          Print all features in the classifier and the weight that they assign to each class.
 void dumpSorted()
          Print all features in the classifier and the weight that they assign to each class.
 Index featureIndex()
           
 Collection<Object> features()
           
 void justificationOf(Datum example)
           
 void justificationOf(Datum example, PrintWriter pw)
          Print all features active for a particular datum and the weight that the classifier assigns to each class for those features.
 void justificationOf(Datum example, PrintWriter pw, boolean sorted)
          Print all features active for a particular datum and the weight that the classifier assigns to each class for those features.
 void justificationOf(Datum example, PrintWriter pw, Function printer)
           
 void justificationOf(Datum example, PrintWriter pw, Function printer, boolean sortedByFeature)
          Print all features active for a particular datum and the weight that the classifier assigns to each class for those features.
 void justificationOf(RVFDatum example)
           
 void justificationOf(RVFDatum example, PrintWriter pw)
          Print all features active for a particular datum and the weight that the classifier assigns to each class for those features.
 Index<Object> labelIndex()
           
 Collection<Object> labels()
           
 Counter logProbabilityOf(Datum example)
          Returns a counter mapping from each class name to the log probability of that class for a certain example.
 Counter logProbabilityOf(RVFDatum example)
          Returns a counter mapping from each class name to the log probability of that class for a certain example.
 Counter probabilityOf(Datum example)
          Returns a counter mapping from each class name to the probability of that class for a certain example.
 Counter probabilityOf(RVFDatum example)
          Returns a counter mapping from each class name to the probability of that class for a certain example.
static LinearClassifier readClassifier(String loadPath)
           
 void saveToFilename(String file)
          Saves this out to a standard text file, instead of as a serialized Java object.
 double scoreOf(Datum example, Object label)
          Returns the score of the Datum for the specified label.
 double scoreOf(RVFDatum example, Object label)
          Returns the score of the RVFDatum for the specified label.
 Counter scoresOf(Datum example)
          Constructs a counter whose keys are the labels of the classifier and whose values are the scores (unnormalized log probabilities) of each class.
 Counter scoresOf(Datum example, Collection possibleLabels)
           
 Counter scoresOf(RVFDatum example)
          Constructs a counter whose keys are the labels of the classifier and whose values are the scores (unnormalized log probabilities) of each class for an RVFDatum.
 void setWeights(double[][] newWeights)
           
 String toAllWeightsString()
           
 String toBiggestWeightFeaturesString(boolean useMagnitude, int numFeatures, boolean printDescending)
          Return a String that prints features with large weights.
 String toDistributionString(int treshold)
          Similar to the histogram, but prints the exact values of the weights to show whether many weights are equal.
 String toHistogramString()
           
 String toString()
          Print out a partial representation of a linear classifier.
 String toString(String style, int param)
          Print out a partial representation of a linear classifier in one of several ways.
 int totalSize()
           
 double weight(Object feature, Object label)
           
 double[][] weights()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

intern

public boolean intern

TEXT_SERIALIZATION_DELIMITER

public static final String TEXT_SERIALIZATION_DELIMITER
See Also:
Constant Field Values
Constructor Detail

LinearClassifier

public LinearClassifier(double[][] weights,
                        Index featureIndex,
                        Index labelIndex)

LinearClassifier

public LinearClassifier(double[][] weights,
                        Index featureIndex,
                        Index labelIndex,
                        double[] thresholds)
                 throws Exception
Throws:
Exception

LinearClassifier

public LinearClassifier(Counter<Pair> weightCounter)

LinearClassifier

public LinearClassifier(Counter<Pair> weightCounter,
                        Counter thresholdsC)
Method Detail

labels

public Collection<Object> labels()
Specified by:
labels in interface Classifier

features

public Collection<Object> features()

labelIndex

public Index<Object> labelIndex()

featureIndex

public Index featureIndex()

weight

public double weight(Object feature,
                     Object label)

scoresOf

public Counter scoresOf(Datum example)
Constructs a counter whose keys are the labels of the classifier and whose values are the scores (unnormalized log probabilities) of each class.

Specified by:
scoresOf in interface Classifier
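
The scoring idea can be sketched in plain Java. This is a minimal illustration, not the LinearClassifier source: it assumes that weights[f][l] holds the weight of feature f for label l, uses java.util lists as stand-ins for Index, and treats every active feature as having value 1. The class name LinearScoreSketch and its scoresOf helper are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a stand-in for the scoring idea, not the library implementation.
final class LinearScoreSketch {

    // Assumed layout: weights[f][l] is the weight of feature f for label l.
    static Map<String, Double> scoresOf(double[][] weights,
                                        List<String> featureIndex,
                                        List<String> labelIndex,
                                        List<String> activeFeatures) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (int l = 0; l < labelIndex.size(); l++) {
            double sum = 0.0;
            for (String feature : activeFeatures) {
                int f = featureIndex.indexOf(feature);
                if (f >= 0) {
                    // Only features known to the index contribute to the score.
                    sum += weights[f][l];
                }
            }
            scores.put(labelIndex.get(l), sum);
        }
        return scores;
    }

    public static void main(String[] args) {
        double[][] weights = {{1.0, -1.0}, {0.5, 2.0}};
        List<String> features = Arrays.asList("hasDigit", "isCapitalized");
        List<String> labels = Arrays.asList("NAME", "OTHER");
        System.out.println(scoresOf(weights, features, labels, features));
    }
}
```

Each score is an unnormalized log probability, so only the differences between labels are meaningful until the scores are normalized.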

scoreOf

public double scoreOf(Datum example,
                      Object label)
Returns the score of the Datum for the specified label. Ignores the true label of the Datum.


scoresOf

public Counter scoresOf(RVFDatum example)
Constructs a counter whose keys are the labels of the classifier and whose values are the scores (unnormalized log probabilities) of each class for an RVFDatum.

Specified by:
scoresOf in interface RVFClassifier

scoreOf

public double scoreOf(RVFDatum example,
                      Object label)
Returns the score of the RVFDatum for the specified label. Ignores the true label of the RVFDatum.
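
For real-valued features, a linear score is usually the sum of each feature value multiplied by its weight. The sketch below illustrates that under stated assumptions: feature values and per-label weights are held in plain maps rather than in an RVFDatum and an Index, and the class RvfScoreSketch is hypothetical, not part of the library.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: weighted sum over real-valued features for one label.
final class RvfScoreSketch {

    // weightsByFeature.get(f)[labelIdx] is the assumed weight of feature f for that label.
    static double scoreOf(Map<String, double[]> weightsByFeature,
                          Map<String, Double> featureValues,
                          int labelIdx) {
        double score = 0.0;
        for (Map.Entry<String, Double> e : featureValues.entrySet()) {
            double[] w = weightsByFeature.get(e.getKey());
            if (w != null) {
                // Each feature contributes value * weight; unknown features are skipped.
                score += e.getValue() * w[labelIdx];
            }
        }
        return score;
    }

    public static void main(String[] args) {
        Map<String, double[]> weights = new HashMap<>();
        weights.put("length", new double[] {0.5, -0.5});
        weights.put("capsRatio", new double[] {2.0, 0.0});

        Map<String, Double> values = new HashMap<>();
        values.put("length", 2.0);
        values.put("capsRatio", 0.5);

        System.out.println(scoreOf(weights, values, 0));
        System.out.println(scoreOf(weights, values, 1));
    }
}
```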


probabilityOf

public Counter probabilityOf(Datum example)
Returns a counter mapping from each class name to the probability of that class for a certain example. The counts v should sum to 1.0.

Specified by:
probabilityOf in interface ProbabilisticClassifier

probabilityOf

public Counter probabilityOf(RVFDatum example)
Returns a counter mapping from each class name to the probability of that class for a certain example. The counts v should sum to 1.0.
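
Since the scores are described as unnormalized log probabilities, one common way to obtain probabilities that sum to 1.0 is a softmax over the scores. The following is a minimal sketch of that normalization, assuming the scores are already in a double array; the class SoftmaxSketch is hypothetical and is not claimed to match the library's internals.

```java
// Illustrative only: turns unnormalized log scores into probabilities.
final class SoftmaxSketch {

    static double[] softmax(double[] scores) {
        // Subtract the maximum first so that Math.exp cannot overflow.
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double total = 0.0;
        double[] probs = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            probs[i] = Math.exp(scores[i] - max);
            total += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= total;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] probs = softmax(new double[] {1.5, 1.0});
        double sum = 0.0;
        for (double p : probs) {
            sum += p;
        }
        System.out.println(probs[0] + " " + probs[1] + " sum=" + sum);
    }
}
```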


logProbabilityOf

public Counter logProbabilityOf(Datum example)
Returns a counter mapping from each class name to the log probability of that class for a certain example. The sum of e^v over all counts v should be 1.0.

Specified by:
logProbabilityOf in interface ProbabilisticClassifier

logProbabilityOf

public Counter logProbabilityOf(RVFDatum example)
Returns a counter mapping from each class name to the log probability of that class for a certain example. The sum of e^v over all counts v should be 1.0.
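
The property that e^v sums to 1.0 can be checked with a small log-space normalization. The sketch below subtracts the log of the summed exponentials (computed with the usual max-shift for numerical stability) from each score. It is an illustration under the assumption that scores arrive as a double array; LogProbSketch is a hypothetical name, not a library class.

```java
// Illustrative only: normalizes scores into log probabilities.
final class LogProbSketch {

    static double[] logProbabilities(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sumExp = 0.0;
        for (double s : scores) {
            sumExp += Math.exp(s - max);
        }
        // log of the normalizer: log(sum_i exp(score_i))
        double logZ = max + Math.log(sumExp);
        double[] logProbs = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            logProbs[i] = scores[i] - logZ;
        }
        return logProbs;
    }

    public static void main(String[] args) {
        double[] logProbs = logProbabilities(new double[] {1.0, 2.0, 3.0});
        double check = 0.0;
        for (double v : logProbs) {
            check += Math.exp(v);
        }
        System.out.println("sum of e^v = " + check);
    }
}
```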


toBiggestWeightFeaturesString

public String toBiggestWeightFeaturesString(boolean useMagnitude,
                                            int numFeatures,
                                            boolean printDescending)
Return a String that prints features with large weights.

Parameters:
useMagnitude - Whether the notion of "large" should ignore the sign of the feature weight.
numFeatures - How many top features to print
Returns:
The String representation of features with large weights
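
The effect of useMagnitude can be illustrated with a small ranking sketch: with the flag set, weights are compared by absolute value, so a strongly negative weight can outrank a smaller positive one. The code below is an assumption-laden illustration only. It ranks a plain map of named weights, does not model printDescending, and the class BiggestWeightsSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: picks the names of the largest weights, largest first.
final class BiggestWeightsSketch {

    static List<String> top(Map<String, Double> weights, boolean useMagnitude, int numFeatures) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(weights.entrySet());
        // With useMagnitude, -3.0 outranks 2.0; without it, 2.0 outranks -3.0.
        Comparator<Map.Entry<String, Double>> byKey = Comparator.comparingDouble(
                e -> useMagnitude ? Math.abs(e.getValue()) : e.getValue());
        entries.sort(byKey.reversed());

        List<String> names = new ArrayList<>();
        int limit = Math.min(numFeatures, entries.size());
        for (int i = 0; i < limit; i++) {
            names.add(entries.get(i).getKey());
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("a", 2.0);
        weights.put("b", -3.0);
        weights.put("c", 0.5);
        System.out.println(top(weights, true, 2));
        System.out.println(top(weights, false, 2));
    }
}
```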

toDistributionString

public String toDistributionString(int treshold)
Similar to the histogram, but prints the exact values of the weights to show whether many weights are equal.

Returns:
A human readable string about the classifier distribution.

totalSize

public int totalSize()

toHistogramString

public String toHistogramString()

toString

public String toString()
Print out a partial representation of a linear classifier. This just calls toString("WeightHistogram", 0)

Overrides:
toString in class Object

toString

public String toString(String style,
                       int param)
Print out a partial representation of a linear classifier in one of several ways.

Parameters:
style - Options are:
    HighWeight: print out the param parameters with largest weights;
    HighMagnitude: print out the param parameters for which the absolute value of their weight is largest;
    AllWeights: print out the weights of all features;
    WeightHistogram: print out a particular hard-coded textual histogram representation of a classifier
param - Determines the number of things printed in certain styles
Throws:
IllegalArgumentException - if the style name is unrecognized

toAllWeightsString

public String toAllWeightsString()

dump

public void dump()
Print all features in the classifier and the weight that they assign to each class.


justificationOf

public void justificationOf(RVFDatum example)

justificationOf

public void justificationOf(RVFDatum example,
                            PrintWriter pw)
Print all features active for a particular datum and the weight that the classifier assigns to each class for those features.


justificationOf

public void justificationOf(Datum example)

justificationOf

public void justificationOf(Datum example,
                            PrintWriter pw,
                            Function printer)

justificationOf

public void justificationOf(Datum example,
                            PrintWriter pw,
                            Function printer,
                            boolean sortedByFeature)
Print all features active for a particular datum and the weight that the classifier assigns to each class for those features.

Parameters:
example - The datum for which features are to be printed
pw - Where to print it to
printer - If this is non-null, then it is applied to each feature to convert it to a more readable form
sortedByFeature - Whether to sort by feature names

justificationOf

public void justificationOf(Datum example,
                            PrintWriter pw)
Print all features active for a particular datum and the weight that the classifier assigns to each class for those features.


dumpSorted

public void dumpSorted()
Print all features in the classifier and the weight that they assign to each class. The feature names are printed in sorted order.


justificationOf

public void justificationOf(Datum example,
                            PrintWriter pw,
                            boolean sorted)
Print all features active for a particular datum and the weight that the classifier assigns to each class for those features. Sorts by feature name if 'sorted' is true.


scoresOf

public Counter scoresOf(Datum example,
                        Collection possibleLabels)

classOf

public Object classOf(Datum example)
Specified by:
classOf in interface Classifier

classOf

public Object classOf(RVFDatum example)
Specified by:
classOf in interface RVFClassifier

adaptWeights

public void adaptWeights(Dataset adapt,
                         LinearClassifierFactory lcf)

weights

public double[][] weights()

setWeights

public void setWeights(double[][] newWeights)

readClassifier

public static LinearClassifier readClassifier(String loadPath)

saveToFilename

public void saveToFilename(String file)
Saves this classifier to a standard text file, instead of as a serialized Java object. NOTE: this currently assumes that features and weights are represented as Strings.

Parameters:
file - String filepath to write out to.


Stanford NLP Group