The following are notes on the LIBSVM (https://github.com/cjlin1/libsvm) library.
The entire library is a single header and a single source file.
An SVM model is represented by the struct `svm_model`. These models can be saved to and loaded from disk via `svm_save_model` and `svm_load_model`. We can create a model via `svm_train`:

```c
struct svm_model *svm_train(const struct svm_problem *prob,
                            const struct svm_parameter *param);
```
`svm_train` takes two arguments: an `svm_problem` struct and an `svm_parameter` struct. The `svm_problem` struct represents the training data set.
```c
struct svm_problem {
    int l;
    double *y;
    struct svm_node **x;
};

struct svm_node {
    int index;
    double value;
};
```
Here `l` is the number of training samples, `y` holds the target values (i.e., the class of each data point), and `x` holds the actual data. An index of -1 marks the end of a row's data.
For example, consider the following data set:
Class | Attr1 | Attr2 | Attr3 | Attr4 | Attr5 |
---|---|---|---|---|---|
1 | 0 | 0.1 | 0.2 | 0 | 0 |
2 | 0 | 0.1 | 0.3 | -1.2 | 0 |
1 | 0.4 | 0 | 0 | 0 | 0 |
2 | 0 | 0.1 | 0 | 1.4 | 0.5 |
3 | -0.1 | -0.2 | 0.1 | 1.1 | 0.1 |
Then the `svm_problem` struct is as follows:
```
{
    "l": 5,
    "y": [1, 2, 1, 2, 3],
    "x": [
        (2,0.1), (3,0.2), (-1,?),
        (2,0.1), (3,0.3), (4,-1.2), (-1,?),
        (1,0.4), (-1,?),
        (2,0.1), (4,1.4), (5,0.5), (-1,?),
        (1,-0.1), (2,-0.2), (3,0.1), (4,1.1), (5,0.1), (-1,?)
    ]
}
```
Note that only the non-zero values need to be stored.
Note: The indices must be in ASCENDING order.
Warning: the `svm_model` contains a pointer to the `svm_problem`. Do not free the `svm_problem` until you are done with the `svm_model`.
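A sketch of the save/load round trip (the filename is made up; `svm_free_and_destroy_model` is LIBSVM's cleanup call, and both I/O functions report failure via their return value):

```c
#include <stdio.h>
#include "svm.h"  /* LIBSVM header */

/* Sketch: persist a trained model to disk, then reload it. */
int roundtrip(struct svm_model *model) {
    if (svm_save_model("example.model", model) != 0) {
        fprintf(stderr, "could not save model\n");
        return -1;
    }
    svm_free_and_destroy_model(&model);

    struct svm_model *loaded = svm_load_model("example.model");
    if (loaded == NULL) {
        fprintf(stderr, "could not load model\n");
        return -1;
    }
    /* ... use loaded ..., then: */
    svm_free_and_destroy_model(&loaded);
    return 0;
}
```

A model loaded from disk owns its own support vectors, so the lifetime warning above applies only to models returned by `svm_train`.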
The `svm_parameter` struct is used to configure the SVM model parameters.
```c
struct svm_parameter {
    int svm_type;
    int kernel_type;
    int degree;          /* for poly */
    double gamma;        /* for poly/rbf/sigmoid */
    double coef0;        /* for poly/sigmoid */
    /* these are for training only */
    double cache_size;   /* in MB */
    double eps;          /* stopping criteria */
    double C;            /* for C_SVC, EPSILON_SVR, and NU_SVR */
    int nr_weight;       /* for C_SVC */
    int *weight_label;   /* for C_SVC */
    double *weight;      /* for C_SVC */
    double nu;           /* for NU_SVC, ONE_CLASS, and NU_SVR */
    double p;            /* for EPSILON_SVR */
    int shrinking;       /* use the shrinking heuristics */
    int probability;     /* do probability estimates */
};
```
The `svm_type` is one of `C_SVC`, `NU_SVC`, `ONE_CLASS`, `EPSILON_SVR`, or `NU_SVR`. To match my code from scikit-learn I should use `C_SVC`. The authors suggest using an epsilon value of 0.001. One can check that the parameters are set up correctly with `svm_check_parameter()`, which returns NULL if everything is okay and otherwise returns a `const char *` describing the error.
```c
const char *svm_check_parameter(const struct svm_problem *prob,
                                const struct svm_parameter *param);
```
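Putting the pieces together, a minimal training sketch might look like this. The gamma of 0.2 assumes the five-attribute example above (1 / number of features); the other values mirror the `svm-train` command-line tool's defaults, and `prob` is assumed to be an already-populated `svm_problem`:

```c
#include <stdio.h>
#include <string.h>
#include "svm.h"

/* Sketch: configure a C_SVC model, validate the parameters,
 * then train. Returns NULL if the parameters are rejected. */
struct svm_model *train_c_svc(const struct svm_problem *prob) {
    struct svm_parameter param;
    memset(&param, 0, sizeof(param));
    param.svm_type = C_SVC;
    param.kernel_type = RBF;
    param.gamma = 0.2;       /* typically 1 / num_features */
    param.C = 1;
    param.eps = 0.001;       /* the suggested stopping tolerance */
    param.cache_size = 100;  /* MB */
    param.shrinking = 1;

    const char *err = svm_check_parameter(prob, &param);
    if (err != NULL) {
        fprintf(stderr, "bad parameters: %s\n", err);
        return NULL;
    }
    return svm_train(prob, &param);
}
```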
Once an `svm_model` is constructed, one can call `svm_predict` to classify a test vector; the predicted class is returned. `svm_predict` takes two arguments: a pointer to the `svm_model` obtained from training and an array of `svm_node` representing the test vector.
```c
double svm_predict(const struct svm_model *model, const struct svm_node *x);
```
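A sketch of classifying a single test vector (the vector's values are made up; like training rows, it is sparse and terminated by index -1):

```c
#include <stdio.h>
#include "svm.h"

/* Sketch: classify one sparse test vector with a trained model. */
void classify(const struct svm_model *model) {
    struct svm_node vec[] = {{1, 0.3}, {3, 0.2}, {-1, 0}};
    double label = svm_predict(model, vec);
    printf("predicted class: %g\n", label);
}
```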
LIBSVM supports cross-validation out of the box. It separates the data into `nr_fold` folds; each fold in turn is validated using a model trained on the remaining folds. The predicted labels from the validation process are stored in `target`.
```c
void svm_cross_validation(const struct svm_problem *prob,
                          const struct svm_parameter *param,
                          int nr_fold, double *target);
```
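A sketch of turning `target` into an accuracy figure (5 folds is an arbitrary choice here):

```c
#include <stdlib.h>
#include "svm.h"

/* Sketch: run 5-fold cross validation and return the fraction of
 * samples whose predicted label matches the true label. */
double cv_accuracy(const struct svm_problem *prob,
                   const struct svm_parameter *param) {
    double *target = malloc(prob->l * sizeof(double));
    int correct = 0;
    svm_cross_validation(prob, param, 5, target);
    for (int i = 0; i < prob->l; i++)
        if (target[i] == prob->y[i])
            correct++;
    free(target);
    return (double)correct / prob->l;
}
```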
A quick test of LIBSVM vs scikit-learn resulted in LIBSVM being approximately 530 times faster:
```
▶ hyperfine './test' 'python scikit.py' -N
Benchmark 1: ./test
  Time (mean ± σ):       1.2 ms ±  0.1 ms    [User: 0.9 ms, System: 0.2 ms]
  Range (min … max):     1.1 ms …  1.5 ms    2301 runs

Benchmark 2: python scikit.py
  Time (mean ± σ):     644.1 ms ± 14.7 ms    [User: 1409.4 ms, System: 66.7 ms]
  Range (min … max):   622.1 ms … 667.4 ms    10 runs

Summary
  ./test ran
  530.01 ± 25.38 times faster than python scikit.py
```