The following are notes on the LIBSVM (https://github.com/cjlin1/libsvm) library.
The entire library is a single header and a single source file.
An SVM model is represented by the struct svm_model. These models can be saved and loaded to disk via svm_save_model and svm_load_model. We can create a model via svm_train:
struct svm_model *svm_train(const struct svm_problem *prob,
const struct svm_parameter *param);svm_train takes in two arguments: a svm_problem struct and a svm_parameter struct. The svm_problem struct represents the training data set.
struct svm_problem {
int l;
double *y;
struct svm_node **x;
};
struct svm_node {
int index;
double value;
}Here l is the size of training data, y is the target values (i.e., classification of each data point) and x is the actual data. An index of -1 is used to indicate the end of a row’s data.
For example, consider the following data set:
| Class | Attr1 | Attr2 | Attr3 | Attr4 | Attr5 |
|---|---|---|---|---|---|
| 1 | 0 | 0.1 | 0.2 | 0 | 0 |
| 2 | 0 | 0.1 | 0.3 | -1.2 | 0 |
| 1 | 0.4 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0.1 | 0 | 1.4 | 0.5 |
| 3 | -0.1 | -0.2 | 0.1 | 1.1 | 0.1 |
Then the svm_problem struct is as follows:
{
"l": 5,
"y": [1,2,1,2,3],
"x": [
(2,0.1), (3,0.2), (-1,?),
(2,0.1), (3,0.3), (4,-1.2), (-1,?),
(1,0.4), (-1,?),
(2,0.1), (4,1.4), (5,0.5), (-1,?),
(1,-0.1), (2,-0.2), (3,0.1), (4,1.1), (5,0.1), (-1,?)
]
}Note that only the non-zero values need to be stored.
Note: The indices must be in ASCENDING order.
Warning: The svm_model contains a pointer to svm_problem. Do not free svm_problem until done with svm_model.
The svm_parameter struct is used to configure the SVM model parameters.
struct svm_parameter {
int svm_type;
int kernel_type;
int degree; /* for poly */
double gamma; /* for poly/rbf/sigmoid */
double coef0; /* for poly/sigmoid */
/* these are for training only */
double cache_size; /* in MB */
double eps; /* stopping criteria */
double C; /* for C_SVC, EPSILON_SVR, and NU_SVR */
int nr_weight; /* for C_SVC */
int *weight_label; /* for C_SVC */
double* weight; /* for C_SVC */
double nu; /* for NU_SVC, ONE_CLASS, and NU_SVR */
double p; /* for EPSILON_SVR */
int shrinking; /* use the shrinking heuristics */
int probability; /* do probability estimates */
};The svm_type is one of C_SVC, NU_SVC, ONE_CLASS, EPSILON_SVR, NU_SVR. To match my code from scikit-learn I should use C_SVC. The authors suggest using an epsilon value of 0.001. One can check that the parameters are setup correctly with svm_check_parameter() which returns NULL if everything is okay, otherwise it returns a const char* describing the error.
const char *svm_check_parameter(const struct svm_problem *prob,
const struct svm_parameter *param);Once an svm_model is constructed, one can call svm_predict to classify a test vector. The class of the vector is returned. svm_predict takes in two arguments: a pointer to the svm_model obtained via training and an array of svm_node representing the test vector.
double svm_predict(const struct svm_model *model, const struct svm_node *x);LIBSVM supports cross validation out of the box. It separates the data into nr_fold folds. Sequentially each fold is validated using the model from training the remaining folds. The predicted labels in the validation process are stored in target.
void svm_cross_validation(const struct svm_problem *prob,
const struct svm_parameter *param, int nr_fold, double *target);A quick test of LIBSVM vs scikit-learn resulted in LIBSVM being approximately 530 times faster:
▶ hyperfine './test' 'python scikit.py' -N
Benchmark 1: ./test
Time (mean ± σ): 1.2 ms ± 0.1 ms [User: 0.9 ms, System: 0.2 ms]
Range (min … max): 1.1 ms … 1.5 ms 2301 runs
Benchmark 2: python scikit.py
Time (mean ± σ): 644.1 ms ± 14.7 ms [User: 1409.4 ms, System: 66.7 ms]
Range (min … max): 622.1 ms … 667.4 ms 10 runs
Summary
./test ran
530.01 ± 25.38 times faster than python scikit.py