java.lang.Object
    pt.tumba.ngram.svm.SVM
public class SVM
Construct and solve various formulations of the support vector machine (SVM) problem.
A support vector machine is a supervised learning algorithm developed over the past decade by Vapnik and others (Vapnik, Statistical Learning Theory, 1998). The algorithm addresses the general problem of learning to discriminate between positive and negative members of a given class of n-dimensional vectors.
The SVM algorithm operates by mapping the given training set into a possibly high-dimensional feature space and attempting to locate in that space a plane that separates the positive from the negative examples. Having found such a plane, the SVM can then predict the classification of an unlabeled example by mapping it into the feature space and asking on which side of the separating plane the example lies. Much of the SVM's power comes from its criterion for selecting a separating plane when many candidate planes exist: the SVM chooses the plane that maintains a maximum margin from any point in the training set. Statistical learning theory suggests that, for some classes of well-behaved data, the choice of the maximum margin hyperplane will lead to maximal generalization when predicting the classification of previously unseen examples (Vapnik, Statistical Learning Theory, 1998). The SVM algorithm can also be extended to cope with noise in the training set and with multiple classes (Cristianini and Shawe-Taylor, An Introduction to Support Vector Machines, 2000).
Say that we have a training data set containing n examples, each of which is a vector of m numbers. These vectors may be thought of as points in an m-dimensional space. In theory, a simple way to build a binary classifier is to construct a hyperplane (i.e., a plane in a space with more than three dimensions) separating class members (positive examples) from non-members (negative examples) in this space. Unfortunately, most real-world problems involve non-separable data for which there does not exist a hyperplane that successfully separates the positive from the negative examples. One solution to the inseparability problem is to map the data into a higher-dimensional space and define a separating hyperplane there. This higher-dimensional space is called the feature space, as opposed to the input space occupied by the training examples. With an appropriately chosen feature space of sufficient dimensionality, any consistent training set can be made separable. However, translating the training set into a higher-dimensional space incurs both computational and learning-theoretic costs. Furthermore, artificially separating the data in this way exposes the learning system to the risk of finding trivial solutions that overfit the data.
SVMs elegantly sidestep both difficulties. They avoid overfitting by choosing the maximum margin separating hyperplane from among the many that can separate the positive from negative examples in the feature space. Also, the decision function for classifying points with respect to the hyperplane only involves dot products between points in the feature space. Because the algorithm that finds a separating hyperplane in the feature space can be stated entirely in terms of vectors in the input space and dot products in the feature space, a support vector machine can locate the hyperplane without ever representing the space explicitly, simply by defining a function, called a kernel function, that plays the role of the dot product in the feature space. This technique avoids the computational burden of explicitly representing the feature vectors.
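To make this concrete, the following sketch (illustrative only, not the implementation used by this class) evaluates a kernelized decision function: the classifier touches the feature space only through kernel calls on input-space vectors, never through explicit feature vectors.

    import java.util.function.ToDoubleBiFunction;

    final class KernelDecisionSketch {
        /**
         * f(x) = sum_i alphaY[i] * K(sv[i], x) + b, where alphaY[i] is the
         * signed weight alpha_i * y_i of support vector i. The sign of the
         * returned value gives the predicted class.
         */
        static double decisionValue(double[][] sv, double[] alphaY, double b,
                                    double[] x,
                                    ToDoubleBiFunction<double[], double[]> kernel) {
            double sum = b;
            for (int i = 0; i < sv.length; i++) {
                sum += alphaY[i] * kernel.applyAsDouble(sv[i], x);
            }
            return sum;
        }
    }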
For some data sets, the SVM may not be able to find a separating hyperplane in feature space, either because the kernel function is inappropriate for the training data or because the data contains mislabeled examples. The latter problem can be addressed by using a soft margin that allows some training examples to fall on the wrong side of the separating hyperplane. Completely specifying a support vector machine therefore requires specifying two parameters: the kernel function and the magnitude of the penalty for violating the soft margin. The settings of these parameters depend on the specific data at hand.
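As a sketch of what the soft-margin penalty trades off, the following method computes the primal soft-margin objective for a linear classifier: a regularization term plus the penalty parameter C times the total margin violation. The method and its names are illustrative, not part of this class.

    final class SoftMarginObjectiveSketch {
        /**
         * Primal soft-margin objective:
         * 0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * (w . x_i + b)).
         * A larger C penalizes margin violations more heavily.
         */
        static double objective(double[] w, double b, double c,
                                double[][] xs, double[] ys) {
            double norm2 = 0.0;
            for (double wj : w) norm2 += wj * wj;
            double penalty = 0.0;
            for (int i = 0; i < xs.length; i++) {
                double margin = 0.0;
                for (int j = 0; j < w.length; j++) margin += w[j] * xs[i][j];
                margin = ys[i] * (margin + b);
                penalty += Math.max(0.0, 1.0 - margin);
            }
            return 0.5 * norm2 + c * penalty;
        }
    }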
Given an expression vector X for each example, the simplest kernel K(X,Y) that we can use to measure the similarity between vectors X and Y is the dot product in the input space, K(X,Y) = X · Y. For technical reasons (namely, that our optimization algorithm requires the hyperplane to pass through the origin), we typically add a constant to this kernel, obtaining K(X,Y) = X · Y + 1. When this dot product kernel is used, the feature space is essentially the same as the m-dimensional input space, and the SVM will classify the examples with a separating hyperplane in this space. Squaring this kernel, i.e. defining K(X,Y) = (X · Y + 1)^2, yields a quadratic separating surface in the input space. The corresponding separating hyperplane in the feature space includes features for all pairwise interactions Xi*Xj between features, where 1 <= i, j <= m. Raising the kernel to higher powers yields polynomial separating surfaces of higher degrees in the input space. In general, the kernel of degree d is defined by K(X,Y) = (X · Y + 1)^d; in the feature space of this kernel, for any example X there are features for all d-fold interactions between features.
Also commonly employed is the radial basis kernel (Scholkopf et al., IEEE Trans Sig Proc 45(11):2758-2765, 1997), which has the Gaussian form K(X,Y) = exp(-||X - Y||^2 / (2 s^2)), where s is the width of the Gaussian. In the SVM, s is usually set equal to the median of the Euclidean distances from each positive example to the nearest negative example.
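The following sketch implements the dot product, polynomial, and radial basis kernels exactly as defined above, together with the median-distance heuristic for the width s. It is illustrative code, not the implementation used by this class.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    final class KernelSketches {

        static double dot(double[] x, double[] y) {
            double sum = 0.0;
            for (int i = 0; i < x.length; i++) sum += x[i] * y[i];
            return sum;
        }

        /** Dot product kernel with the constant offset: K(X,Y) = X.Y + 1. */
        static double linear(double[] x, double[] y) {
            return dot(x, y) + 1.0;
        }

        /** Polynomial kernel of degree d: K(X,Y) = (X.Y + 1)^d. */
        static double polynomial(double[] x, double[] y, int d) {
            return Math.pow(dot(x, y) + 1.0, d);
        }

        /** Radial basis kernel: K(X,Y) = exp(-||X - Y||^2 / (2 s^2)). */
        static double rbf(double[] x, double[] y, double s) {
            double dist2 = 0.0;
            for (int i = 0; i < x.length; i++) {
                double diff = x[i] - y[i];
                dist2 += diff * diff;
            }
            return Math.exp(-dist2 / (2.0 * s * s));
        }

        /**
         * Width heuristic from the text: the median, over the positive
         * examples, of the Euclidean distance to the nearest negative example.
         */
        static double medianWidth(double[][] positives, double[][] negatives) {
            List<Double> nearest = new ArrayList<>();
            for (double[] p : positives) {
                double best = Double.POSITIVE_INFINITY;
                for (double[] n : negatives) {
                    double dist2 = 0.0;
                    for (int i = 0; i < p.length; i++) {
                        double diff = p[i] - n[i];
                        dist2 += diff * diff;
                    }
                    best = Math.min(best, Math.sqrt(dist2));
                }
                nearest.add(best);
            }
            Collections.sort(nearest);
            return nearest.get(nearest.size() / 2);
        }
    }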
In many real-world problems, one of the two classes examined contains very few members relative to the other class. This imbalance in the number of positive and negative training examples, in combination with noise in the data, is likely to cause the SVM to make incorrect classifications. When the magnitude of the noise in the negative examples outweighs the total number of positive examples, the optimal hyperplane located by the SVM will be uninformative, classifying all members of the training set as negative examples. We combat this problem by modifying the matrix of kernel values computed during SVM optimization. Let X(1), ..., X(n) be the examples in the training set, and let K be the matrix defined by the kernel function K on this training set; i.e., Kij = K(X(i),X(j)). By adding to the diagonal of the kernel matrix a constant whose magnitude depends on the class of the training example, one can control the fraction of misclassified points in the two classes. This technique ensures that the positive points are not regarded as noisy labels. For positive examples, the diagonal element is modified by Kii := Kii + f*r*(n+/N), where n+ is the number of positive training examples, N is the total number of training examples, f is a user-specified scale factor, and r is the median diagonal kernel element. A similar formula is used for the negative examples, with n+ replaced by n-.
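The following sketch applies this diagonal correction to a precomputed kernel matrix; the method name and signature are illustrative, not part of this class.

    final class DiagonalCorrectionSketch {
        /**
         * Class-conditional ridge from the text: K_ii += f * r * (n+/N) for
         * positive examples and K_ii += f * r * (n-/N) for negative ones,
         * where r is the median diagonal element of K and f is a
         * user-specified scale factor.
         */
        static void addDiagonalCorrection(double[][] k, boolean[] isPositive,
                                          double f) {
            int n = k.length;
            // r = median of the diagonal kernel elements.
            double[] diag = new double[n];
            for (int i = 0; i < n; i++) diag[i] = k[i][i];
            java.util.Arrays.sort(diag);
            double r = diag[n / 2];

            int nPos = 0;
            for (boolean b : isPositive) if (b) nPos++;
            int nNeg = n - nPos;

            for (int i = 0; i < n; i++) {
                double fraction = (isPositive[i] ? nPos : nNeg) / (double) n;
                k[i][i] += f * r * fraction;
            }
        }
    }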
A more mathematically detailed discussion of these techniques is available at http://www.cse.ucsc.edu/research/compbio/genex.
Nested Class Summary

(package private) static class SVM.decisionFunction
    Inner class modeling the data for the SVM decision function, used for classifying points with respect to the hyperplane.
Field Summary

(package private) static java.lang.String[] kernelTypeTable
    An array with all the kernel type names.
(package private) static java.lang.String[] svmTypeTable
    Table with all the SVM problem formulation names.
Constructor Summary

SVM()
Method Summary

private static void multiclassProbability(int k, double[][] r, double[] p)
    Method 2 from the multi-class probability paper by Wu, Lin, and Weng.
private static double sigmoidPredict(double decision_value, double A, double B)
private static void sigmoidTrain(int l, double[] dec_values, double[] labels, double[] probAB)
    Platt's binary SVM probabilistic output, with the improvement from Lin et al.
private static void solveCSVC(SVMProblem prob, SVMParameter param, double[] alpha, Solver.SolutionInfo si, double Cp, double Cn)
    Solve the C-Support Vector Classification problem.
private static void solveEpsilonSVR(SVMProblem prob, SVMParameter param, double[] alpha, Solver.SolutionInfo si)
    Solve the Epsilon-Support Vector Regression problem.
private static void solveNUSVC(SVMProblem prob, SVMParameter param, double[] alpha, Solver.SolutionInfo si)
    Solve the Nu-Support Vector Classification problem.
private static void solveNUSVR(SVMProblem prob, SVMParameter param, double[] alpha, Solver.SolutionInfo si)
    Solve the Nu-Support Vector Regression problem.
private static void solveOneClass(SVMProblem prob, SVMParameter param, double[] alpha, Solver.SolutionInfo si)
    Solve the SVM distribution estimation problem.
private static void svmBinarySVCProbability(SVMProblem prob, SVMParameter param, double Cp, double Cn, double[] probAB)
    Cross-validation decision values for probability estimates.
static java.lang.String svmCheckParameter(SVMProblem prob, SVMParameter param)
    Check the parameters given to an SVM problem.
static int svmCheckProbabilityModel(SVMModel model)
static void svmCrossValidation(SVMProblem prob, SVMParameter param, int nr_fold, double[] target)
    Perform cross-validation.
static void svmGetLabels(SVMModel model, int[] label)
static double svmGetSVRProbability(SVMModel model)
static SVMModel svmLoadModel(java.lang.String model_file_name)
    Loads a trained SVM model from disk.
static double svmPredict(SVMModel model, SVMNode[] x)
static double svmPredictProbability(SVMModel model, SVMNode[] x, double[] prob_estimates)
static void svmPredictValues(SVMModel model, SVMNode[] x, double[] dec_values)
static void svmSaveModel(java.lang.String model_file_name, SVMModel model)
    Outputs a trained model to disk, for use in future classification tasks.
private static double svmSVRProbability(SVMProblem prob, SVMParameter param)
    Return the parameter of a Laplace distribution.
static SVMModel svmTrain(SVMProblem prob, SVMParameter param)
    Train the SVM.
(package private) static SVM.decisionFunction svmTrainOne(SVMProblem prob, SVMParameter param, double Cp, double Cn)
    Computation for one step of the training procedure.
Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail

static final java.lang.String[] svmTypeTable
    Table with all the SVM problem formulation names.

static final java.lang.String[] kernelTypeTable
    An array with all the kernel type names.
Constructor Detail

public SVM()
Method Detail
private static void solveCSVC(SVMProblem prob, SVMParameter param, double[] alpha, Solver.SolutionInfo si, double Cp, double Cn)
    Solve the C-Support Vector Classification problem.
    Parameters:
        prob - The SVM problem.
        param - The parameters.
        alpha - The SVM alpha values (will be filled by learning).
        si - Information about the solution (used for returning multiple values).
        Cp - The soft-margin penalty for the positive class.
        Cn - The soft-margin penalty for the negative class.

private static void solveNUSVC(SVMProblem prob, SVMParameter param, double[] alpha, Solver.SolutionInfo si)
    Solve the Nu-Support Vector Classification problem.
    Parameters:
        prob - The SVM problem.
        param - The parameters.
        alpha - The SVM alpha values (will be filled by learning).
        si - Information about the solution (used for returning multiple values).

private static void solveOneClass(SVMProblem prob, SVMParameter param, double[] alpha, Solver.SolutionInfo si)
    Solve the SVM distribution estimation problem.
    Parameters:
        prob - The SVM problem.
        param - The parameters.
        alpha - The SVM alpha values (will be filled by learning).
        si - Information about the solution (used for returning multiple values).

private static void solveEpsilonSVR(SVMProblem prob, SVMParameter param, double[] alpha, Solver.SolutionInfo si)
    Solve the Epsilon-Support Vector Regression problem.
    Parameters:
        prob - The SVM problem.
        param - The parameters.
        alpha - The SVM alpha values (will be filled by learning).
        si - Information about the solution (used for returning multiple values).

private static void solveNUSVR(SVMProblem prob, SVMParameter param, double[] alpha, Solver.SolutionInfo si)
    Solve the Nu-Support Vector Regression problem.
    Parameters:
        prob - The SVM problem.
        param - The parameters.
        alpha - The SVM alpha values (will be filled by learning).
        si - Information about the solution (used for returning multiple values).

static SVM.decisionFunction svmTrainOne(SVMProblem prob, SVMParameter param, double Cp, double Cn)
    Computation for one step of the training procedure.
    Parameters:
        prob - The SVM problem.
        param - The parameters.
        Cp - The soft-margin penalty for the positive class.
        Cn - The soft-margin penalty for the negative class.

private static void sigmoidTrain(int l, double[] dec_values, double[] labels, double[] probAB)
    Platt's binary SVM probabilistic output, with the improvement from Lin et al.
    Parameters:
        l - The number of training examples.
        dec_values - The decision value for each example.
        labels - The class label for each example.
        probAB - Output array receiving the fitted sigmoid parameters A and B.

private static double sigmoidPredict(double decision_value, double A, double B)
    Parameters:
        decision_value - The SVM decision value for the example.
        A - The sigmoid slope parameter fitted by sigmoidTrain.
        B - The sigmoid offset parameter fitted by sigmoidTrain.

private static void multiclassProbability(int k, double[][] r, double[] p)
    Method 2 from the multi-class probability paper by Wu, Lin, and Weng.
    Parameters:
        k - The number of classes.
        r - The matrix of pairwise class probability estimates.
        p - Output array receiving the per-class probability estimates.

private static void svmBinarySVCProbability(SVMProblem prob, SVMParameter param, double Cp, double Cn, double[] probAB)
    Cross-validation decision values for probability estimates.
    Parameters:
        prob - The SVM problem.
        param - The parameters.
        Cp - The soft-margin penalty for the positive class.
        Cn - The soft-margin penalty for the negative class.
        probAB - Output array receiving the fitted sigmoid parameters A and B.

private static double svmSVRProbability(SVMProblem prob, SVMParameter param)
    Return the parameter of a Laplace distribution.
    Parameters:
        prob - The SVM problem.
        param - The parameters.
public static SVMModel svmTrain(SVMProblem prob, SVMParameter param)
    Train the SVM.
    Parameters:
        prob - The SVM problem.
        param - The parameters.
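A minimal training sketch follows. The static methods svmCheckParameter and svmTrain are documented on this page, but the fields used below to populate SVMProblem, SVMNode, and SVMParameter are assumptions based on the libsvm conventions this class appears to follow; check them against the actual classes before use.

    import pt.tumba.ngram.svm.*;

    public class TrainSketch {
        public static void main(String[] args) {
            SVMProblem prob = new SVMProblem();
            // Assumed libsvm-style fields: l = number of examples,
            // y = labels, x = sparse example vectors.
            prob.l = 2;
            prob.y = new double[] { +1.0, -1.0 };
            prob.x = new SVMNode[][] {
                { node(1, 0.5), node(2, 1.0) },
                { node(1, -0.4), node(2, 0.1) }
            };

            SVMParameter param = new SVMParameter();
            // The kernel type and soft-margin penalty would be set on param
            // here; those field names are not documented on this page, so
            // they are omitted rather than guessed.

            String err = SVM.svmCheckParameter(prob, param);
            if (err != null) {  // assumed: null means the parameters are valid
                throw new IllegalArgumentException(err);
            }
            SVMModel model = SVM.svmTrain(prob, param);
        }

        // Assumed SVMNode fields: index and value, as in libsvm.
        private static SVMNode node(int index, double value) {
            SVMNode n = new SVMNode();
            n.index = index;
            n.value = value;
            return n;
        }
    }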
public static void svmCrossValidation(SVMProblem prob, SVMParameter param, int nr_fold, double[] target)
    Perform cross-validation.
    Parameters:
        prob - The SVM problem.
        param - The parameters.
        nr_fold - Number of folds for the cross-validation.
        target - Output array receiving the cross-validated prediction for each example.

public static void svmGetLabels(SVMModel model, int[] label)
    Parameters:
        model - The SVM model.
        label - Output array receiving the class labels known to the model.

public static double svmGetSVRProbability(SVMModel model)
    Parameters:
        model - The SVM model.

public static void svmPredictValues(SVMModel model, SVMNode[] x, double[] dec_values)
    Parameters:
        model - The SVM model.
        x - The example to classify, as an array of SVM nodes.
        dec_values - Output array receiving the decision values.

public static double svmPredict(SVMModel model, SVMNode[] x)
    Parameters:
        model - The SVM model.
        x - The example to classify, as an array of SVM nodes.

public static double svmPredictProbability(SVMModel model, SVMNode[] x, double[] prob_estimates)
    Parameters:
        model - The SVM model.
        x - The example to classify, as an array of SVM nodes.
        prob_estimates - Output array receiving the per-class probability estimates.
public static void svmSaveModel(java.lang.String model_file_name, SVMModel model) throws java.io.IOException
    Outputs a trained model to disk, for use in future classification tasks.
    Parameters:
        model_file_name - The name of the file storing the model.
        model - The SVM model.
    Throws:
        java.io.IOException - A problem occurred while storing the model.

public static SVMModel svmLoadModel(java.lang.String model_file_name) throws java.io.IOException
    Loads a trained SVM model from disk.
    Parameters:
        model_file_name - The path name for the file containing the model.
    Throws:
        java.io.IOException - A problem occurred while reading the model file.

public static java.lang.String svmCheckParameter(SVMProblem prob, SVMParameter param)
    Check the parameters given to an SVM problem.
    Parameters:
        prob - An SVM problem.
        param - The parameters.

public static int svmCheckProbabilityModel(SVMModel model)
    Parameters:
        model - An SVM model.
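A matching persistence and prediction sketch, using the svmSaveModel, svmLoadModel, and svmPredict signatures documented above; as in the training sketch, the SVMNode fields (index, value) are assumptions based on the libsvm lineage of this package.

    import pt.tumba.ngram.svm.*;

    public class PredictSketch {
        public static void main(String[] args) throws java.io.IOException {
            // Both calls are documented above to throw java.io.IOException.
            SVMModel model = SVM.svmLoadModel("ngram-classifier.model");

            // Build a sparse test vector with the assumed SVMNode fields.
            SVMNode n1 = new SVMNode(); n1.index = 1; n1.value = 0.3;
            SVMNode n2 = new SVMNode(); n2.index = 2; n2.value = 0.7;
            SVMNode[] x = { n1, n2 };

            double predicted = SVM.svmPredict(model, x);
            System.out.println("predicted label: " + predicted);

            // Save the model again for later classification tasks.
            SVM.svmSaveModel("ngram-classifier.model", model);
        }
    }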