edu.stanford.nlp.classify
Class NBLinearClassifierFactory

java.lang.Object
  extended by edu.stanford.nlp.classify.AbstractLinearClassifierFactory
      extended by edu.stanford.nlp.classify.NBLinearClassifierFactory

public class NBLinearClassifierFactory
extends AbstractLinearClassifierFactory

Provides a medium-weight implementation of Bernoulli (or binary) Naive Bayes via a linear classifier. It's medium-weight in that it uses dense arrays for counts and calculation (but, hey, NB is efficient to estimate). Each feature is treated as an independent binary variable.
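
For orientation, here is a minimal sketch of how textbook Bernoulli Naive Bayes can be written as a linear classifier over dense count arrays. It is not this class's code: the class name BernoulliNbSketch, the boolean-matrix input, and the smoothed estimate (count + sigma) / (classCount + 2 * sigma) are assumptions made for illustration, and the factory's own estimates may well differ.

```java
/** Textbook Bernoulli Naive Bayes expressed as linear weights plus a per-class bias. */
class BernoulliNbSketch {
    final double[][] weights; // weights[f][c] = log(p_fc / (1 - p_fc))
    final double[] bias;      // bias[c] = log P(c) + sum over f of log(1 - p_fc)

    /**
     * @param x     x[i][f] is true when feature f is on in example i
     * @param y     y[i] is the class index of example i (every class must occur)
     * @param sigma add-sigma smoothing; must be positive so that 0 < p_fc < 1
     */
    BernoulliNbSketch(boolean[][] x, int[] y, int numClasses, double sigma) {
        int numFeatures = x[0].length;
        double[] classCount = new double[numClasses];              // dense counts
        double[][] onCount = new double[numFeatures][numClasses];
        for (int i = 0; i < x.length; i++) {
            classCount[y[i]]++;
            for (int f = 0; f < numFeatures; f++) {
                if (x[i][f]) onCount[f][y[i]]++;
            }
        }
        weights = new double[numFeatures][numClasses];
        bias = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            bias[c] = Math.log(classCount[c] / x.length);          // log P(c)
            for (int f = 0; f < numFeatures; f++) {
                // Smoothed estimate of P(feature f is on | class c).
                double p = (onCount[f][c] + sigma) / (classCount[c] + 2 * sigma);
                weights[f][c] = Math.log(p / (1 - p));
                bias[c] += Math.log(1 - p);
            }
        }
    }

    /** Linear score of class c: the bias plus the weights of the features that are on. */
    double score(boolean[] example, int c) {
        double s = bias[c];
        for (int f = 0; f < example.length; f++) {
            if (example[f]) s += weights[f][c];
        }
        return s;
    }

    /** Returns the class with the highest linear score. */
    int classify(boolean[] example) {
        int best = 0;
        for (int c = 1; c < bias.length; c++) {
            if (score(example, c) > score(example, best)) best = c;
        }
        return best;
    }
}
```

The score equals the usual Bernoulli log joint probability, because the "feature off" terms are folded into the bias and only features that are on contribute a weight.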

CDM Jun 2003: I added a dirty trick so that if there is a feature that is always on in input examples, then its weight is turned into a prior feature! (This will work well iff it is also always on at test time.) In fact, this is done for each such feature, so with several such features the prior is counted once per feature, which gives an integral prior boost.
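
The effect of the trick can be illustrated with a small sketch. It assumes, purely for illustration, that each always-on feature receives the log of the empirical class prior as its weight; the exact value this class assigns is not stated here, and the names AlwaysOnPriorSketch, alwaysOn, logPrior and priorBoost are hypothetical.

```java
import java.util.Arrays;

class AlwaysOnPriorSketch {
    /** Marks each feature that is on in every training example. */
    static boolean[] alwaysOn(boolean[][] x) {
        boolean[] on = new boolean[x[0].length];
        Arrays.fill(on, true);
        for (boolean[] example : x) {
            for (int f = 0; f < on.length; f++) {
                on[f] &= example[f];
            }
        }
        return on;
    }

    /** Log of the empirical class prior (assumed weight of one always-on feature). */
    static double[] logPrior(int[] y, int numClasses) {
        double[] count = new double[numClasses];
        for (int label : y) count[label]++;
        double[] lp = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            lp[c] = Math.log(count[c] / y.length);
        }
        return lp;
    }

    /** Summed contribution of k always-on features to each class score: k * log P(c). */
    static double[] priorBoost(boolean[][] x, int[] y, int numClasses) {
        int k = 0;
        for (boolean on : alwaysOn(x)) {
            if (on) k++;
        }
        double[] lp = logPrior(y, numClasses);
        double[] boost = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            boost[c] = k * lp[c];
        }
        return boost;
    }
}
```

With k always-on features the class score gains k * log P(c), which is the log of the prior raised to the integer power k. If such a feature is missing at test time, that share of the prior is silently dropped.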

Author:
Dan Klein

Constructor Summary
NBLinearClassifierFactory()
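          Create a ClassifierFactory with default settings.
NBLinearClassifierFactory(double sigma)
          Create a ClassifierFactory with the given amount of add-sigma smoothing.
NBLinearClassifierFactory(double sigma, boolean interpretAlwaysOnFeatureAsPrior)
          Create a ClassifierFactory with the given amount of add-sigma smoothing and the given treatment of always-on features.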
 
Method Summary
 void setTuneSigmaCV(int folds)
          Sets the tuneSigma flag: when turned on, sigma is tuned by cross-validation.
protected  double[][] trainWeights(GeneralDataset data)
           
 
Methods inherited from class edu.stanford.nlp.classify.AbstractLinearClassifierFactory
trainClassifier, trainClassifier, trainClassifier
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NBLinearClassifierFactory

public NBLinearClassifierFactory()
Create a ClassifierFactory with default settings.


NBLinearClassifierFactory

public NBLinearClassifierFactory(double sigma)
Create a ClassifierFactory with the given amount of add-sigma smoothing.

Parameters:
sigma - The amount of add-sigma smoothing of evidence

NBLinearClassifierFactory

public NBLinearClassifierFactory(double sigma,
                                 boolean interpretAlwaysOnFeatureAsPrior)
Create a ClassifierFactory with the given amount of add-sigma smoothing and the given treatment of always-on features.

Parameters:
sigma - The amount of add-sigma smoothing of evidence
interpretAlwaysOnFeatureAsPrior - If true, a feature that is in every data item is interpreted as an indication to include a prior factor over classes. (If there are multiple such features, an integral "prior boost" will occur.) If false, an always-on feature is interpreted as an evidence feature and, following the standard math, will have no effect on the model.
Method Detail

trainWeights

protected double[][] trainWeights(GeneralDataset data)
Specified by:
trainWeights in class AbstractLinearClassifierFactory

setTuneSigmaCV

public void setTuneSigmaCV(int folds)
Sets the tuneSigma flag: when turned on, sigma is tuned by cross-validation. The parameter is the number of folds. If there are fewer data items than folds, leave-one-out cross-validation is used. By default the flag is false, so sigma is not tuned.
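
The fold rule described above can be sketched as follows. Only the fallback to leave-one-out comes from this documentation; the candidate grid, the round-robin fold assignment, and the FoldScorer callback are hypothetical, since the search procedure this class uses is not described here.

```java
import java.util.stream.IntStream;

class SigmaCvSketch {
    /**
     * Hypothetical callback: trains with the given sigma on every example except
     * heldOut and returns how many held-out examples were classified correctly.
     */
    interface FoldScorer {
        double correct(double sigma, int[] heldOut);
    }

    /** With fewer examples than folds, each example becomes its own fold (leave-one-out). */
    static int effectiveFolds(int numExamples, int requestedFolds) {
        return Math.min(requestedFolds, numExamples);
    }

    /** Indices held out in one fold, using round-robin assignment. */
    static int[] heldOut(int numExamples, int folds, int fold) {
        return IntStream.range(0, numExamples).filter(i -> i % folds == fold).toArray();
    }

    /** Returns the candidate sigma with the most correct held-out predictions. */
    static double tuneSigma(double[] candidates, int numExamples,
                            int requestedFolds, FoldScorer scorer) {
        int folds = effectiveFolds(numExamples, requestedFolds);
        double bestSigma = candidates[0];
        double bestCorrect = Double.NEGATIVE_INFINITY;
        for (double sigma : candidates) {
            double correct = 0;
            for (int fold = 0; fold < folds; fold++) {
                correct += scorer.correct(sigma, heldOut(numExamples, folds, fold));
            }
            if (correct > bestCorrect) {
                bestCorrect = correct;
                bestSigma = sigma;
            }
        }
        return bestSigma;
    }
}
```

Every example is held out exactly once per candidate, so the totals are comparable across candidate values of sigma.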



Stanford NLP Group