All Questions (78)
Some courses which have used libsvm as a tool
Some applications/tools which have used libsvm
Where can I find documents/videos of libsvm ?
Where are change log and earlier versions?
How to cite LIBSVM?
I would like to use libsvm in my software. Is there any license problem?
Is there a repository of additional tools based on libsvm?
On unix machines, I got "error in loading shared libraries" or "cannot open shared object file." What happened ?
I have modified the source and would like to build the graphic interface "svm-toy" on MS windows. How should I do it ?
I am an MS windows user but why only one (svm-toy) of those precompiled .exe actually runs ?
What is the difference between "." and "*" output during training?
Why occasionally the program (including MATLAB or other interfaces) crashes and gives a segmentation fault?
How to build a dynamic library (.dll file) on MS windows?
On some systems (e.g., Ubuntu), compiling LIBSVM gives many warning messages. Is this a problem and how to disable the warning message?
In LIBSVM, why you don't use certain C/C++ library functions to make the code shorter?
Why sometimes not all attributes of a data appear in the training/model files ?
What if my data are non-numerical ?
Why do you consider sparse format ? Will the training of dense data be much slower ?
Why sometimes the last line of my data is not read by svm-train?
Is there a program to check if my data are in the correct format?
May I put comments in data files?
How to convert other data formats to LIBSVM format?
The output of training C-SVM is like the following. What do they mean?
Can you explain more about the model file?
Should I use float or double to store numbers in the cache ?
How do I choose the kernel?
Does libsvm have special treatments for linear SVM?
The number of free support vectors is large. What should I do?
Should I scale training and testing data in a similar way?
Does it make a big difference if I scale each attribute to [0,1] instead of [-1,1]?
The prediction rate is low. How could I improve it?
My data are unbalanced. Could libsvm handle such problems?
What is the difference between nu-SVC and C-SVC?
The program keeps running (without showing any output). What should I do?
The program keeps running (with output, i.e. many dots). What should I do?
The training time is too long. What should I do?
Does shrinking always help?
How do I get the decision value(s)?
How do I get the distance between a point and the hyperplane?
On 32-bit machines, if I use a large cache (i.e. large -m) on a linux machine, why sometimes I get "segmentation fault ?"
How do I disable screen output of svm-train?
I would like to use my own kernel. Any example? In svm.cpp, there are two subroutines for kernel evaluations: k_function() and kernel_function(). Which one should I modify ?
What method does libsvm use for multi-class SVM ? Why don't you use the "1-against-the rest" method?
How does LIBSVM perform parameter selection for multi-class problems?
After doing cross validation, why there is no model file outputted ?
Why my cross-validation results are different from those in the Practical Guide?
On some systems CV accuracy is the same in several runs. How could I use different data partitions? In other words, how do I set random seed in LIBSVM?
I would like to solve L2-loss SVM (i.e., error term is quadratic). How should I modify the code ?
How do I choose parameters for one-class svm as training data are in only one class?
Why the code gives NaN (not a number) results?
Why on windows sometimes grid.py fails?
Why grid.py/easy.py sometimes generates the following warning message?
Why the sign of predicted labels and decision values are sometimes reversed?
I don't know class labels of test data. What should I put in the first column of the test file?
How can I use OpenMP to parallelize LIBSVM on a multicore/shared-memory computer?
How could I know which training instances are support vectors?
Why training a probability model (i.e., -b 1) takes a longer time?
Why using the -b option does not give me better accuracy?
Why using svm-predict -b 0 and -b 1 gives different accuracy values?
How can I save images drawn by svm-toy?
I press the "load" button to load data points but why svm-toy does not draw them ?
I would like svm-toy to handle more than three classes of data, what should I do ?
What is the difference between Java version and C++ version of libsvm?
Is the Java version significantly slower than the C++ version?
While training I get the following error message: java.lang.OutOfMemoryError. What is wrong?
Why you have the main source file svm.m4 and then transform it to svm.java?
Except the python-C++ interface provided, could I use Jython to call libsvm ?
I compile the MATLAB interface without problem, but why errors occur while running it?
On 64bit Windows I compile the MATLAB interface without problem, but why errors occur while running it?
Does the MATLAB interface provide a function to do scaling?
How could I use MATLAB interface for parameter selection?
I use MATLAB parallel programming toolbox on a multi-core environment for parameter selection. Why the program is even slower?
How do I use LIBSVM with OpenMP under MATLAB?
How could I generate the primal variable w of linear SVM?
Is there an OCTAVE interface for libsvm?
How to handle the name conflict between svmtrain in the libsvm matlab interface and that in MATLAB bioinformatics toolbox?
On Windows I got an error message "Invalid MEX-file: Specific module not found" when running the pre-built MATLAB interface in the windows sub-directory. What should I do?
LIBSVM supports 1-vs-1 multi-class classification. If instead I would like to use 1-vs-rest, how to implement it using MATLAB interface?
Q: Some courses which have used libsvm as a tool
Institute for Computer Science,
Faculty of Applied Science, University of Freiburg, Germany
Division of Mathematics and Computer Science.
Faculteit der Exacte Wetenschappen
Vrije Universiteit, The Netherlands.
Electrical and Computer Engineering Department,
University of Wisconsin-Madison
Technion (Israel Institute of Technology), Israel.
Computer and Information Sciences Dept., University of Florida
The Institute of Computer Science,
University of Nairobi, Kenya.
Applied Mathematics and Computer Science, University of Iceland.
SVM tutorial in machine learning
summer school, University of Chicago, 2005.
Q: Some applications/tools which have used libsvm
(and maybe liblinear).
Q: Where can I find documents/videos of libsvm ?
Official implementation document:
C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011.
Instructions for using LIBSVM are in the README files in the main directory and some sub-directories.
README in the main directory: details all options, data format, and library calls.
tools/README: parameter selection and other tools
A guide for beginners:
C.-W. Hsu, C.-C. Chang, and C.-J. Lin. A practical guide to support vector classification.
An introductory video for Windows users is also available.
Q: Where are change log and earlier versions?
See the change log. Earlier versions are also available for download.
Q: How to cite LIBSVM?
Please cite the following paper:
Chih-Chung Chang and Chih-Jen Lin, LIBSVM
: a library for support vector machines.
ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011.
Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
The bibtex format is
@article{CC01a,
author = {Chang, Chih-Chung and Lin, Chih-Jen},
title = {{LIBSVM}: A library for support vector machines},
journal = {ACM Transactions on Intelligent Systems and Technology},
volume = {2},
issue = {3},
year = {2011},
pages = {27:1--27:27},
note = {Software available at \url{http://www.csie.ntu.edu.tw/~cjlin/libsvm}}
}
Q: I would like to use libsvm in my software. Is there any license problem?
The libsvm license ("the modified BSD license")
is compatible with many
free software licenses such as GPL. Hence, it is very easy to
use libsvm in your software.
Please check the COPYRIGHT file in detail. Basically you need to
- Clearly indicate that LIBSVM is used, and
- Retain the LIBSVM COPYRIGHT file in your software.
It can also be used in commercial products.
Q: Is there a repository of additional tools based on libsvm?
Yes, see LIBSVM Tools.
Q: On unix machines, I got "error in loading shared libraries" or "cannot open shared object file." What happened ?
This usually happens if you compile the code
on one machine and run it on another which has incompatible
libraries.
Try to recompile the program on that machine or use static linking.
Q: I have modified the source and would like to build the graphic interface "svm-toy" on MS windows. How should I do it ?
Build it as a project by choosing "Win32 Project."
On the other hand, for "svm-train" and "svm-predict"
you want to choose "Win32 Console Project."
After libsvm 2.5, you can also use the file Makefile.win.
See details in README.
If you are not using Makefile.win and see the following
link error
LIBCMTD.lib(wwincrt0.obj) : error LNK2001: unresolved external symbol
_wWinMain@16
you may have selected a wrong project type.
Q: I am an MS windows user but why only one (svm-toy) of those precompiled .exe actually runs ?
You need to open a command window
and run the programs there; for example, type svm-train.exe to see all options.
Some examples are in the README file.
Q: What is the difference between "." and "*" output during training?
"." means every 1,000 iterations (or every #data
iterations is your #data is less than 1,000).
"*" means that after iterations of using
a smaller shrunk problem,
we reset to use the whole set. See the
implementation document for details.
Q: Why occasionally the program (including MATLAB or other interfaces) crashes and gives a segmentation fault?
Very likely the program consumes more memory than the
operating system can provide. Try a smaller data set and see if the
program still crashes.
Q: How to build a dynamic library (.dll file) on MS windows?
The easiest way is to use Makefile.win.
See details in README.
Alternatively, you can use Visual C++. Here is
the example using Visual Studio .Net 2008:
Create a Win32 empty DLL project and set (in Project->$Project_Name
Properties...->Configuration) to "Release."
About how to create a new dynamic link library, please refer to
http://msdn2.microsoft.com/en-us/library/ms235636(VS.80).aspx
Add svm.cpp, svm.h to your project.
Add __WIN32__ and _CRT_SECURE_NO_DEPRECATE to Preprocessor definitions (in
Project->$Project_Name Properties...->C/C++->Preprocessor)
Set Create/Use Precompiled Header to Not Using Precompiled Headers
(in Project->$Project_Name Properties...->C/C++->Precompiled Headers)
Set the path for the Module-Definition File svm.def (in
Project->$Project_Name Properties...->Linker->Input)
Build the DLL.
Rename the dll file to libsvm.dll and move it to the correct path.
Q: On some systems (e.g., Ubuntu), compiling LIBSVM gives many warning messages. Is this a problem and how to disable the warning message?
The warning message is like
svm.cpp:2730: warning: ignoring return value of int fscanf(FILE*, const char*, ...), declared with attribute warn_unused_result
This is not a problem; it is related to how such systems (e.g., Ubuntu)
build gcc with _FORTIFY_SOURCE enabled by default.
In the future we may modify the code
so that these messages do not appear.
At this moment, to disable the warning message you can replace
CFLAGS = -Wall -Wconversion -O3 -fPIC
with
CFLAGS = -Wall -Wconversion -O3 -fPIC -U_FORTIFY_SOURCE
in Makefile.
Q: In LIBSVM, why you don't use certain C/C++ library functions to make the code shorter?
For portability, we use only features defined in ISO C89. Note that features in ISO C99 may not be available everywhere.
Even the newest gcc lacks some features in C99 (see http://gcc.gnu.org/c99status.html for details).
If the situation changes in the future,
we might consider using these newer features.
Q: Why sometimes not all attributes of a data appear in the training/model files ?
libsvm uses the so-called "sparse" format where zero
values do not need to be stored. Hence an instance with attribute values
1 0 2 0
is represented as
1:1 3:2
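For illustration, a minimal Python sketch (the helper is hypothetical, not part of LIBSVM) that converts a dense row to this sparse format:

# Hypothetical helper: dense attribute values -> LIBSVM sparse format.
def to_sparse(values):
    # Indices start from 1; zero values are dropped.
    return " ".join("%d:%g" % (i + 1, v) for i, v in enumerate(values) if v != 0)

print(to_sparse([1, 0, 2, 0]))  # prints: 1:1 3:2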
Q: What if my data are non-numerical ?
Currently libsvm supports only numerical data.
You may have to change non-numerical data to
numerical. For example, you can use several
binary attributes to represent a categorical
attribute.
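As an illustration, a small Python sketch (the attribute values are hypothetical) that maps one three-valued categorical attribute to three binary attributes:

# Hypothetical example: a categorical attribute with values
# "red", "green", "blue" becomes three binary attributes.
categories = ["red", "green", "blue"]

def encode(value):
    # "red" -> [1, 0, 0], "green" -> [0, 1, 0], "blue" -> [0, 0, 1]
    return [1 if value == c else 0 for c in categories]

print(encode("green"))  # [0, 1, 0]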
Q: Why do you consider sparse format ? Will the training of dense data be much slower ?
This is a controversial issue. The kernel
evaluation (i.e., inner product) of sparse vectors is slower,
so the total training time can be two or three times
that of using the dense format.
However, we cannot support only the dense format, as then we CANNOT
handle extremely sparse cases. Simplicity of the code is another
concern. For now we have decided to support
the sparse format only.
Q: Why sometimes the last line of my data is not read by svm-train?
We assume that you have '\n' at the end of
each line. So please press enter at the end
of your last line.
Q: Is there a program to check if my data are in the correct format?
The svm-train program in libsvm conducts only a simple check of the input data. To do a
detailed check, after libsvm 2.85, you can use the python script tools/checkdata.py. See tools/README for details.
Q: May I put comments in data files?
We don't officially support this. But currently LIBSVM
is able to process data in the following
format:
1 1:2 2:1 # your comments
Note that the character ":" should not appear in your
comments.
Q: How to convert other data formats to LIBSVM format?
It depends on your data format. A simple way is to use
libsvmwrite in the libsvm matlab/octave interface.
Take a CSV (comma-separated values) file
from the UCI machine learning repository as an example.
We download SPECTF.train.
Labels are in the first column. The following steps produce
a file in the libsvm format.
matlab> SPECTF = csvread('SPECTF.train'); % read a csv file
matlab> labels = SPECTF(:, 1); % labels from the 1st column
matlab> features = SPECTF(:, 2:end);
matlab> features_sparse = sparse(features); % features must be in a sparse matrix
matlab> libsvmwrite('SPECTFlibsvm.train', labels, features_sparse);
The transformed data are stored in SPECTFlibsvm.train.
Alternatively, you can use convert.c
to convert CSV format to libsvm format.
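If you prefer Python, a rough sketch of the same conversion (analogous in spirit to convert.c; it assumes the label is in the first CSV column, as in SPECTF.train, and drops zero features):

import csv

# Rough sketch: CSV with the label in the first column -> LIBSVM format.
with open("SPECTF.train") as fin, open("SPECTFlibsvm.train", "w") as fout:
    for row in csv.reader(fin):
        label, feats = row[0], row[1:]
        pairs = ["%d:%s" % (i + 1, v) for i, v in enumerate(feats) if float(v) != 0]
        fout.write(label + " " + " ".join(pairs) + "\n")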
Q: The output of training C-SVM is like the following. What do they mean?
optimization finished, #iter = 219
nu = 0.431030
obj = -100.877286, rho = 0.424632
nSV = 132, nBSV = 107
Total nSV = 132
obj is the optimal objective value of the dual SVM problem.
rho is the bias term in the decision function
sgn(w^Tx - rho).
nSV and nBSV are the numbers of support vectors and bounded support
vectors (i.e., alpha_i = C). nu-SVM is a somewhat equivalent
form of C-SVM where C is replaced by nu. nu simply shows the
corresponding parameter. More details are in the
libsvm implementation document.
Q: Can you explain more about the model file?
In the model file, after parameters and other information such as labels, each line represents a support vector.
Support vectors are listed in the order of "labels" shown earlier.
(i.e., those from the first class in the "labels" list are
grouped first, and so on.)
If k is the total number of classes,
in front of a support vector in class j there are
k-1 coefficients
y*alpha, where the alphas are the dual solutions of the
following two-class problems:
1 vs j, 2 vs j, ..., j-1 vs j, j vs j+1, j vs j+2, ..., j vs k
and y=1 for the first j-1 coefficients, y=-1 for the remaining
k-j coefficients.
For example, if there are 4 classes, the file looks like:
+-+-+-+--------------------+
|1|1|1| |
|v|v|v| SVs from class 1 |
|2|3|4| |
+-+-+-+--------------------+
|1|2|2| |
|v|v|v| SVs from class 2 |
|2|3|4| |
+-+-+-+--------------------+
|1|2|3| |
|v|v|v| SVs from class 3 |
|3|3|4| |
+-+-+-+--------------------+
|1|2|3| |
|v|v|v| SVs from class 4 |
|4|4|4| |
+-+-+-+--------------------+
See also
an illustration using
MATLAB/OCTAVE.
Q: Should I use float or double to store numbers in the cache ?
We have float as the default, as you can store more numbers
in the cache.
In general this is good enough, but for a few difficult
cases (e.g., very large C) where solutions are huge
numbers, the numerical precision of float may not be
enough.
Q: How do I choose the kernel?
In general we suggest you try the RBF kernel first.
A result by Keerthi and Lin
shows that if RBF is used with model selection,
then there is no need to consider the linear kernel.
The kernel matrix using sigmoid may not be positive definite,
and in general its accuracy is not better than RBF
(see the paper by Lin and Lin).
Polynomial kernels are ok, but if a high degree is used,
numerical difficulties tend to happen
(think of the d-th power of a number: values less than 1 go to 0
and values greater than 1 go to infinity).
Q: Does libsvm have special treatments for linear SVM?
No, libsvm solves linear/nonlinear SVMs in the
same way.
Some tricks could save training/testing time if the
linear kernel is used,
so libsvm is NOT particularly efficient for linear SVM,
especially when
C is large and
the number of data is much larger
than the number of attributes.
For such cases, you can consider LIBLINEAR instead.
Please also see our SVM guide
for a discussion of using RBF and linear
kernels.
Q: The number of free support vectors is large. What should I do?
This usually happens when the data are overfitted.
If attributes of your data are in large ranges,
try to scale them. Then the region
of appropriate parameters may be larger.
Note that there is a scale program
in libsvm.
Q: Should I scale training and testing data in a similar way?
Yes, you can do the following:
> svm-scale -s scaling_parameters train_data > scaled_train_data
> svm-scale -r scaling_parameters test_data > scaled_test_data
Q: Does it make a big difference if I scale each attribute to [0,1] instead of [-1,1]?
For the linear scaling method, if the RBF kernel is
used and parameter selection is conducted, there
is no difference. Assume Mi and mi are
respectively the maximal and minimal values of the
ith attribute. Scaling to [0,1] means
x'=(x-mi)/(Mi-mi)
For [-1,1],
x''=2(x-mi)/(Mi-mi)-1.
In the RBF kernel,
x'-y'=(x-y)/(Mi-mi) and x''-y''=2(x-y)/(Mi-mi),
so |x''-y''|^2 is 4 times |x'-y'|^2.
Hence, using (C,g) on the [0,1]-scaled data is the
same as (C,g/4) on the [-1,1]-scaled data.
Though the performance is the same, the computational
time may be different. For data with many zero entries,
[0,1]-scaling keeps the sparsity of input data and hence
may save the time.
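A quick numeric check of this equivalence in Python (hypothetical one-attribute data; any g works):

import math

x, y, Mi, mi, g = 0.7, 0.2, 1.0, 0.0, 2.0
x1, y1 = (x - mi) / (Mi - mi), (y - mi) / (Mi - mi)                  # [0,1] scaling
x2, y2 = 2 * (x - mi) / (Mi - mi) - 1, 2 * (y - mi) / (Mi - mi) - 1  # [-1,1] scaling

k01 = math.exp(-g * (x1 - y1) ** 2)        # RBF with gamma g on [0,1] data
k11 = math.exp(-(g / 4) * (x2 - y2) ** 2)  # RBF with gamma g/4 on [-1,1] data
print(k01, k11)  # the two kernel values are identical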
Q: The prediction rate is low. How could I improve it?
Try the model selection tool grid.py in the python
directory to find good parameters.
To see the importance of model selection,
please see my talk:
A practical guide to support vector classification.
Q: My data are unbalanced. Could libsvm handle such problems?
Yes, there is a -wi option. For example, if you use
> svm-train -s 0 -c 10 -w1 1 -w-1 5 data_file
the penalty for class "-1" is larger.
Note that this -w option is for C-SVC only.
Q: What is the difference between nu-SVC and C-SVC?
Basically they are the same thing but with different
parameters. The range of C is from zero to infinity,
while nu is always in [0,1]. A nice property
of nu is that it is related to the ratio of
support vectors and the ratio of training
errors.
Q: The program keeps running (without showing any output). What should I do?
You may want to check your data. Each training/testing
instance must be on a single line; it cannot span multiple lines.
In addition, you have to remove empty lines.
Q: The program keeps running (with output, i.e. many dots). What should I do?
In theory libsvm is guaranteed to converge.
Therefore, this means you are
handling ill-conditioned situations
(e.g. too large/small parameters) so numerical
difficulties occur.
You may get better numerical stability by replacing
typedef float Qfloat;
in svm.cpp with
typedef double Qfloat;
That is, elements in the kernel cache are stored
in double instead of single. However, this means fewer elements
can be put in the kernel cache.
Q: The training time is too long. What should I do?
For large problems, please specify enough cache size (i.e.,
-m).
Slow convergence may happen for some difficult cases (e.g. -c is large).
You can try to use a looser stopping tolerance with -e.
If that still doesn't work, you may train only a subset of the data.
You can use the program subset.py in the directory "tools"
to obtain a random subset.
If you have extremely large data and face this difficulty, please
contact us. We will be happy to discuss possible solutions.
When using large -e, you may want to check if -h 0 (no shrinking) or -h 1 (shrinking) is faster.
See a related question below.
Q: Does shrinking always help?
If the number of iterations is high, then shrinking
often helps.
However, if the number of iterations is small
(e.g., you specify a large -e), then
probably using -h 0 (no shrinking) is better.
See the
implementation document for details.
Q: How do I get the decision value(s)?
We print out decision values for regression. For classification,
we solve several binary SVMs for multi-class cases. You
can easily obtain values by calling the subroutine
svm_predict_values. Their corresponding labels
can be obtained from svm_get_labels.
Details are in
README of libsvm package.
If you are using MATLAB/OCTAVE interface, svmpredict can directly
give you decision values. Please see matlab/README for details.
We do not recommend the following. But if you would
like to get values for
TWO-class classification with labels +1 and -1
(note: +1 and -1 but not things like 5 and 10)
in the easiest way, simply add
printf("%f\n", dec_values[0]*model->label[0]);
after the line
svm_predict_values(model, x, dec_values);
of the file svm.cpp.
Positive (negative)
decision values correspond to data predicted as +1 (-1).
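The Python interface behaves like the MATLAB one: svm_predict in svmutil returns decision values as its third output. A minimal sketch (heart_scale is assumed to be in the working directory; in recent versions the import may be "from libsvm.svmutil import *"):

from svmutil import *

# Without -b, the third return value holds the decision value(s)
# of each instance (one value per binary problem).
y, x = svm_read_problem("heart_scale")
m = svm_train(y, x, "-c 1")
p_label, p_acc, p_val = svm_predict(y, x, m)
print(p_val[0])  # decision value(s) of the first instance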
Q: How do I get the distance between a point and the hyperplane?
The distance is |decision_value| / |w|.
We have |w|^2 = w^Tw = alpha^T Q alpha = 2*(dual_obj + sum alpha_i).
Thus in svm.cpp please find the place
where we calculate the dual objective value
(i.e., the subroutine Solve())
and add a statement to print w^Tw.
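If you would rather not modify svm.cpp, here is a hedged sketch for the LINEAR kernel only, using the Python interface: w is recovered from the model as sum_i sv_coef_i * SV_i (via the get_sv_coef()/get_SV() helpers in svm.py), and the decision value is divided by |w|:

from svmutil import *

# Two-class, linear-kernel case: distance = |decision_value| / |w|.
y, x = svm_read_problem("heart_scale")
m = svm_train(y, x, "-t 0 -c 1")

w = {}
for coef, sv in zip(m.get_sv_coef(), m.get_SV()):
    for idx, val in sv.items():
        if idx != -1:  # index -1 is the end-of-vector marker
            w[idx] = w.get(idx, 0.0) + coef[0] * val
norm_w = sum(v * v for v in w.values()) ** 0.5

p_label, p_acc, p_val = svm_predict(y, x, m)
print(abs(p_val[0][0]) / norm_w)  # distance of the first instance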
Q: On 32-bit machines, if I use a large cache (i.e. large -m) on a linux machine, why sometimes I get "segmentation fault ?"
On 32-bit machines, the maximum addressable
memory is 4GB. The Linux kernel uses a 3:1
split, which means user space is 3GB and
kernel space is 1GB. Although there is
3GB of user space, the maximum dynamic allocation
memory is 2GB. So, if you specify -m near 2G,
the memory will be exhausted, and svm-train
will fail when it asks for more memory.
For more details, please read this article.
The easiest solution is to switch to a
64-bit machine.
Otherwise, there are two ways to solve this. If your
machine supports Intel's PAE (Physical Address
Extension), you can turn on the option HIGHMEM64G
in the Linux kernel, which uses a 4G:4G split for
kernel and user space. If you don't, you can
try a software `tub' which can eliminate the 2G
boundary for dynamically allocated memory. The `tub'
is available at
http://www.bitwagon.com/tub.html.
Q: How do I disable screen output of svm-train?
For command-line users, use the -q option:
> ./svm-train -q heart_scale
For library users, set the global variable
extern void (*svm_print_string) (const char *);
to specify the output function. You can disable the output by the following steps:
Declare a function to output nothing:
void print_null(const char *s) {}
Assign the output function of libsvm by
svm_print_string = &print_null;
Finally, a way used in earlier libsvm
is by updating svm.cpp from
#if 1
void info(const char *fmt,...)
to
#if 0
void info(const char *fmt,...)
Q: I would like to use my own kernel. Any example? In svm.cpp, there are two subroutines for kernel evaluations: k_function() and kernel_function(). Which one should I modify ?
An example is "LIBSVM for string data" in LIBSVM Tools.
The reason why we have two functions is as follows.
For the RBF kernel exp(-g |xi - xj|^2), if we calculate
xi - xj first and then the norm square, there are 3n operations.
Thus we consider exp(-g (|xi|^2 - 2dot(xi,xj) +|xj|^2))
and by calculating all |xi|^2 in the beginning,
the number of operations is reduced to 2n.
This is for the training. For prediction we cannot
do this, so a regular subroutine using 3n operations is
needed.
The easiest way to have your own kernel is
to put the same code in these two
subroutines by replacing any kernel.
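If you only need a custom kernel and would rather not touch svm.cpp at all, the precomputed-kernel interface (-t 4; see "Precomputed Kernels" in the README) is an alternative. A hedged Python sketch, where my_kernel and the data are hypothetical:

# Write a precomputed-kernel training file for svm-train -t 4.
def my_kernel(xi, xj):
    # Stand-in for your own kernel; here a squared linear kernel.
    return sum(a * b for a, b in zip(xi, xj)) ** 2

X = [[1, 1], [1, -1], [-1, 1]]  # hypothetical training instances
y = [1, -1, -1]

with open("train.precomputed", "w") as f:
    for i, xi in enumerate(X):
        cols = ["%d:%g" % (j + 1, my_kernel(xi, xj)) for j, xj in enumerate(X)]
        f.write("%g 0:%d %s\n" % (y[i], i + 1, " ".join(cols)))
# Then run: svm-train -t 4 train.precomputed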
Q: What method does libsvm use for multi-class SVM ? Why don't you use the "1-against-the rest" method?
It is one-against-one. We chose it after doing the following
comparison:
C.-W. Hsu and C.-J. Lin. A comparison of methods
for multi-class support vector machines,
IEEE Transactions on Neural Networks, 13(2002), 415-425.
"1-against-the rest" is a good method whose performance
is comparable to "1-against-1." We do the latter
simply because its training time is shorter.
Q: How does LIBSVM perform parameter selection for multi-class problems?
LIBSVM implements "one-against-one" multi-class method, so there are
k(k-1)/2 binary models, where k is the number of classes.
We can consider two ways to conduct parameter selection.
1. For any two classes of data, a parameter selection procedure is conducted. Finally,
each decision function has its own optimal parameters.
2. The same parameters are used for all k(k-1)/2 binary classification problems.
We select parameters that achieve the highest overall performance.
Each has its own advantages. A
single parameter set may not be uniformly good for all k(k-1)/2 decision functions.
However, as the overall accuracy is the final consideration, one parameter set
for one decision function may lead to over-fitting. In the paper
Chen, Lin, and Schölkopf,
A tutorial on nu-support vector machines.
Applied Stochastic Models in Business and Industry, 21(2005), 111-136,
they have experimentally
shown that the two methods give similar performance.
Therefore, currently the parameter selection in LIBSVM
takes the second approach by considering the same parameters for
all k(k-1)/2 models.
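A minimal sketch of this second approach with the Python interface (when '-v' is given, svm_train returns the cross-validation accuracy; heart_scale is just a placeholder data set):

from svmutil import *

# One (C, gamma) pair is selected for the whole multi-class problem,
# i.e., for all k(k-1)/2 binary models, scored by 5-fold CV accuracy.
y, x = svm_read_problem("heart_scale")
best = (0.0, None, None)
for log2c in range(-1, 4):
    for log2g in range(-4, 2):
        c, g = 2.0 ** log2c, 2.0 ** log2g
        acc = svm_train(y, x, "-q -v 5 -c %g -g %g" % (c, g))
        if acc > best[0]:
            best = (acc, c, g)
print("best CV accuracy %g%% with c=%g, g=%g" % best)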
Q: After doing cross validation, why there is no model file outputted ?
Cross validation is used for selecting good parameters.
After finding them, you want to re-train on the whole
data without the -v option.
Q: Why my cross-validation results are different from those in the Practical Guide?
Due to random partitions of
the data, on different systems CV accuracy values
may be different.
Q: On some systems CV accuracy is the same in several runs. How could I use different data partitions? In other words, how do I set random seed in LIBSVM?
If you use the GNU C library,
the default seed is 1. Thus you always
get the same result when running svm-train -v.
To have different seeds, you can add the following code
in svm-train.c:
#include <time.h>
and in the beginning of main(),
srand(time(0));
Alternatively, if you are not using GNU C library
and would like to use a fixed seed, you can have
srand(1);
For Java, the random number generator
is initialized using the time information.
So results of two CV runs are different.
To fix the seed, after version 3.1 (released
in mid 2011), you can add
svm.rand.setSeed(0);
in the main() function of svm_train.java.
If you use CV to select parameters, it is recommended to use identical folds
under different parameters. In this case, you can consider fixing the seed.
Q: I would like to solve L2-loss SVM (i.e., error term is quadratic). How should I modify the code ?
It is extremely easy. Taking c-svc for example, to solve
min_w w^Tw/2 + C \sum max(0, 1 - y_i (w^Tx_i + b))^2,
only two
places of svm.cpp have to be changed.
First, modify the following line of
solve_c_svc from
s.Solve(l, SVC_Q(*prob,*param,y), minus_ones, y,
alpha, Cp, Cn, param->eps, si, param->shrinking);
to
s.Solve(l, SVC_Q(*prob,*param,y), minus_ones, y,
alpha, INF, INF, param->eps, si, param->shrinking);
Second, in the class of SVC_Q, declare C as
a private variable:
double C;
In the constructor replace
for(int i=0;i<prob.l;i++)
QD[i]= (Qfloat)(this->*kernel_function)(i,i);
with
this->C = param.C;
for(int i=0;i<prob.l;i++)
QD[i]= (Qfloat)(this->*kernel_function)(i,i)+0.5/C;
Then in the subroutine get_Q, after the for loop, add
if(i >= start && i < len)
data[i] += 0.5/C;
For one-class svm, the modification is exactly the same. For SVR, you don't need an if statement like the above. Instead, you only need a simple assignment:
data[real_i] += 0.5/C;
For large linear L2-loss SVM, please use
LIBLINEAR .
Q: How do I choose parameters for one-class svm as training data are in only one class?
You should have a pre-specified true positive rate in mind and then search for
parameters which achieve a similar cross-validation accuracy.
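A hedged sketch of this idea with the Python interface (the file name and target rate are hypothetical; for simplicity the rate is checked on the training data itself rather than by cross validation):

from svmutil import *

# Scan nu and keep the value whose fraction of training points
# predicted as positive is closest to the desired true positive rate.
target_tpr = 0.95
y, x = svm_read_problem("oneclass.train")  # hypothetical single-class data
best_gap, best_nu = 1.0, None
for nu in [0.01, 0.05, 0.1, 0.2, 0.5]:
    m = svm_train(y, x, "-s 2 -n %g -q" % nu)
    labels, _, _ = svm_predict(y, x, m)
    tpr = sum(1 for l in labels if l == 1) / float(len(labels))
    if abs(tpr - target_tpr) < best_gap:
        best_gap, best_nu = abs(tpr - target_tpr), nu
print("chosen nu =", best_nu)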
Q: Why the code gives NaN (not a number) results?
This rarely happens, but a few users have reported the problem.
It seems that their
computers for training libsvm had a VPN client
running. The VPN software has some bugs and causes this
problem. Please try to close or disconnect the VPN client.
Q: Why on windows sometimes grid.py fails?
This problem shouldn't happen after version
2.85. If you are using earlier versions,
please download the latest one.
Q: Why grid.py/easy.py sometimes generates the following warning message?
Warning: empty z range [62.5:62.5], adjusting to [61.875:63.125]
Notice: cannot contour non grid data!
Nothing is wrong and please disregard the
message. It is from gnuplot when drawing
the contour.
Q: Why the sign of predicted labels and decision values are sometimes reversed?
Nothing is wrong. Very likely you have two labels +1/-1 and the first instance in your data
has -1.
Think about the case of labels +5/+10. Since
SVM needs to use +1/-1, internally
we map +5/+10 to +1/-1 according to which
label appears first.
Hence a positive decision value implies
that we should predict the "internal" +1,
which may not be the +1 in the input file.
Q: I don't know class labels of test data. What should I put in the first column of the test file?
Any value is ok. In this situation, what you will use is the output file of svm-predict, which gives predicted class labels.
Q: How can I use OpenMP to parallelize LIBSVM on a multicore/shared-memory computer?
It is very easy if you are using GCC 4.2
or later.
In Makefile, add -fopenmp to CFLAGS.
In class SVC_Q of svm.cpp, modify the for loop
of get_Q to:
#pragma omp parallel for private(j)
for(j=start;j<len;j++)
In the subroutine svm_predict_values of svm.cpp, add one line to the for loop:
#pragma omp parallel for private(i)
for(i=0;i<l;i++)
kvalue[i] = Kernel::k_function(x,model->SV[i],model->param);
For regression, you need to modify
class SVR_Q instead. The loop in svm_predict_values
is also different because you need
a reduction clause for the variable sum:
#pragma omp parallel for private(i) reduction(+:sum)
for(i=0;i<model->l;i++)
sum += sv_coef[i] * Kernel::k_function(x,model->SV[i],model->param);
Then rebuild the package. Kernel evaluations in training/testing will be parallelized. An example of running this modification on
an 8-core machine using the data set
ijcnn1 :
8 cores:
%setenv OMP_NUM_THREADS 8
%time svm-train -c 16 -g 4 -m 400 ijcnn1
27.1sec
1 core:
%setenv OMP_NUM_THREADS 1
%time svm-train -c 16 -g 4 -m 400 ijcnn1
79.8sec
For this data, kernel evaluations take 80% of training time. In the above example, we assume you use csh. For bash, use
export OMP_NUM_THREADS=8
instead.
For Python interface, you need to add the -lgomp link option:
$(CXX) -lgomp -shared -dynamiclib svm.o -o libsvm.so.$(SHVER)
For MS Windows, you need to add /openmp in CFLAGS of Makefile.win
Q: How could I know which training instances are support vectors?
It's very simple. Since version 3.13, you can use the function
void svm_get_sv_indices(const struct svm_model *model, int *sv_indices)
to get indices of support vectors. For example, in svm-train.c, after
model = svm_train(&prob, &param);
you can add
int nr_sv = svm_get_nr_sv(model);
int *sv_indices = Malloc(int, nr_sv);
svm_get_sv_indices(model, sv_indices);
for (int i=0; i<nr_sv; i++)
printf("instance %d is a support vector\n", sv_indices[i]);
If you use matlab interface, you can directly check
model.sv_indices
Q: Why training a probability model (i.e., -b 1) takes a longer time?
To construct this probability model, we internally conduct a
cross validation, which is more time consuming than
a regular training.
Hence, in general you do parameter selection first without
-b 1. You only use -b 1 when good parameters have been
selected. In other words, you avoid using -b 1 and -v
together.
Q: Why using the -b option does not give me better accuracy?
There is absolutely no reason the probability outputs guarantee
you better accuracy. The main purpose of this option is
to provide you the probability estimates, but not to boost
prediction accuracy. From our experience,
after proper parameter selection, models trained with
and without -b generally have similar accuracy. Occasionally there
are some differences.
It is not recommended to compare the two under
just a fixed parameter
set, as more differences will be observed.
Q: Why using svm-predict -b 0 and -b 1 gives different accuracy values?
Let's just consider two-class classification here. After probability information is obtained in training,
we do not have
"prob >= 0.5 if and only if decision value >= 0."
So predictions with -b 0 and -b 1 may be different.
Q: How can I save images drawn by svm-toy?
For Microsoft windows, first press the "print screen" key on the keyboard.
Open "Microsoft Paint"
(included in Windows)
and press "ctrl-v." Then you can clip
the part of picture which you want.
For X windows, you can
use the program "xv" or "import" to grab the picture of the svm-toy window.
Q: I press the "load" button to load data points but why svm-toy does not draw them ?
The program svm-toy assumes both attributes (i.e. x-axis and y-axis
values) are in (0,1). Hence you want to scale your
data to between a small positive number and
a number less than but very close to 1.
Moreover, class labels must be 1, 2, or 3
(not 1.0, 2.0 or anything else).
Q: I would like svm-toy to handle more than three classes of data, what should I do ?
Taking windows/svm-toy.cpp as an example, you need to
modify it; the difference
from the original file is as follows (for five classes of
data):
30,32c30
< RGB(200,0,200),
< RGB(0,160,0),
< RGB(160,0,0)
---
> RGB(200,0,200)
39c37
< HBRUSH brush1, brush2, brush3, brush4, brush5;
---
> HBRUSH brush1, brush2, brush3;
113,114d110
< brush4 = CreateSolidBrush(colors[7]);
< brush5 = CreateSolidBrush(colors[8]);
155,157c151
< else if(v==3) return brush3;
< else if(v==4) return brush4;
< else return brush5;
---
> else return brush3;
325d318
< int colornum = 5;
327c320
< svm_node *x_space = new svm_node[colornum * prob.l];
---
> svm_node *x_space = new svm_node[3 * prob.l];
333,338c326,331
< x_space[colornum * i].index = 1;
< x_space[colornum * i].value = q->x;
< x_space[colornum * i + 1].index = 2;
< x_space[colornum * i + 1].value = q->y;
< x_space[colornum * i + 2].index = -1;
< prob.x[i] = &x_space[colornum * i];
---
> x_space[3 * i].index = 1;
> x_space[3 * i].value = q->x;
> x_space[3 * i + 1].index = 2;
> x_space[3 * i + 1].value = q->y;
> x_space[3 * i + 2].index = -1;
> prob.x[i] = &x_space[3 * i];
397c390
< if(current_value > 5) current_value = 1;
---
> if(current_value > 3) current_value = 1;
Q: What is the difference between Java version and C++ version of libsvm?
They are the same thing. We just rewrote the C++ code
in Java.
Q: Is the Java version significantly slower than the C++ version?
This depends on the VM you use. We have seen a good
VM which makes the Java version quite competitive with
the C++ code (though still slower).
Q: While training I get the following error message: java.lang.OutOfMemoryError. What is wrong?
You should try to increase the maximum Java heap size.
For example,
java -Xmx2048m -classpath libsvm.jar svm_train ...
sets the maximum heap size to 2048M.
Q: Why you have the main source file svm.m4 and then transform it to svm.java?
Unlike C, Java does not have a built-in preprocessor.
However, we need some macros (see the first 3 lines of svm.m4).
Q: Except the python-C++ interface provided, could I use Jython to call libsvm ?
Yes, here are some examples:
$ export CLASSPATH=$CLASSPATH:~/libsvm-2.91/java/libsvm.jar
$ ./jython
Jython 2.1a3 on java1.3.0 (JIT: jitc)
Type "copyright", "credits" or "license" for more information.
>>> from libsvm import *
>>> dir()
['__doc__', '__name__', 'svm', 'svm_model', 'svm_node', 'svm_parameter',
'svm_problem']
>>> x1 = [svm_node(index=1,value=1)]
>>> x2 = [svm_node(index=1,value=-1)]
>>> param = svm_parameter(svm_type=0,kernel_type=2,gamma=1,cache_size=40,eps=0.001,C=1,nr_weight=0,shrinking=1)
>>> prob = svm_problem(l=2,y=[1,-1],x=[x1,x2])
>>> model = svm.svm_train(prob,param)
*
optimization finished, #iter = 1
nu = 1.0
obj = -1.018315639346838, rho = 0.0
nSV = 2, nBSV = 2
Total nSV = 2
>>> svm.svm_predict(model,x1)
1.0
>>> svm.svm_predict(model,x2)
-1.0
>>> svm.svm_save_model("test.model",model)
Q: I compile the MATLAB interface without problem, but why errors occur while running it?
Your compiler version may not be supported by MATLAB.
Please check the list of supported compilers in the MATLAB
documentation first and then specify the version
number. For example, if g++ X.Y is supported, replace
CXX = g++
in the Makefile with
CXX = g++-X.Y
Q: On 64bit Windows I compile the MATLAB interface without problem, but why errors occur while running it?
Please make sure that you use
the -largeArrayDims option in make.m. For example,
mex -largeArrayDims -O -c svm.cpp
Moreover, if you use Microsoft Visual Studio,
probably it is not properly installed.
Q: Does the MATLAB interface provide a function to do scaling?
It is extremely easy to do scaling under MATLAB.
The following one-line code scales each feature to the range
of [0,1]:
(data - repmat(min(data,[],1),size(data,1),1))*spdiags(1./(max(data,[],1)-min(data,[],1))',0,size(data,2),size(data,2))
Q: How could I use MATLAB interface for parameter selection?
One can do this by a simple loop.
See the following example:
bestcv = 0;
for log2c = -1:3,
for log2g = -4:1,
cmd = ['-v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
cv = svmtrain(heart_scale_label, heart_scale_inst, cmd);
if (cv >= bestcv),
bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
end
fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', log2c, log2g, cv, bestc, bestg, bestcv);
end
end
You may adjust the parameter range in the above loops.
Q: I use MATLAB parallel programming toolbox on a multi-core environment for parameter selection. Why the program is even slower?
Fabrizio Lacalandra of University of Pisa reported this issue.
It seems the problem is caused by the screen output.
If you disable the info function
using #if 0 (see the FAQ entry on disabling the screen
output of svm-train), then the problem
may be solved.
Q: How do I use LIBSVM with OpenMP under MATLAB?
In Makefile,
you need to add -fopenmp to CFLAGS and -lgomp to MEX_OPTION. For Octave, you need the same modification.
However, a minor problem is that
the number of threads cannot
be specified in MATLAB. We tried Version 7.12 (R2011a) and gcc-4.6.1.
% export OMP_NUM_THREADS=4; matlab
>> setenv('OMP_NUM_THREADS', '1');
Then OMP_NUM_THREADS is still 4 while running the program. Please contact us if you
see how to solve this problem. You can, however,
specify the number in the source code (thanks
to comments from Ricardo Santiago-mozos):
#pragma omp parallel for private(i) num_threads(4)
Q: How could I generate the primal variable w of linear SVM?
Let's start from the binary class and
assume you have two labels -1 and +1.
After obtaining the model from calling svmtrain,
do the following to have w and b:
w = model.SVs' * model.sv_coef;
b = -model.rho;
if model.Label(1) == -1
w = -w;
b = -b;
end
If you do regression or one-class SVM, then the if statement is not needed.
For multi-class SVM, we illustrate the setting
in the following example of running the iris
data, which has 3 classes:
> [y, x] = libsvmread('../../htdocs/libsvmtools/datasets/multiclass/iris.scale');
> m = svmtrain(y, x, '-t 0')
m =
Parameters: [5x1 double]
nr_class: 3
totalSV: 42
rho: [3x1 double]
Label: [3x1 double]
ProbA: []
ProbB: []
nSV: [3x1 double]
sv_coef: [42x2 double]
SVs: [42x4 double]
sv_coef is like:
+-+-+--------------------+
|1|1| |
|v|v| SVs from class 1 |
|2|3| |
+-+-+--------------------+
|1|2| |
|v|v| SVs from class 2 |
|2|3| |
+-+-+--------------------+
|1|2| |
|v|v| SVs from class 3 |
|3|3| |
+-+-+--------------------+
so we need to check nSV for each class.
> m.nSV
ans =
3
21
18
Suppose the goal is to find the vector w of classes
1 vs 3. Then
y_i alpha_i of training 1 vs 3 are
> coef = [m.sv_coef(1:3,2); m.sv_coef(25:42,1)];
and SVs are:
> SVs = [m.SVs(1:3,:); m.SVs(25:42,:)];
Hence, w is
> w = SVs'*coef;
For rho,
> m.rho
ans =
1.1465
0.3682
-1.9969
> b = -m.rho(2);
because rho is arranged by 1vs2 1vs3 2vs3.
Q: Is there an OCTAVE interface for libsvm?
Yes, after libsvm 2.86, the matlab interface
works on OCTAVE as well. Please use make.m by typing
>> make
under OCTAVE.
Q: How to handle the name conflict between svmtrain in the libsvm matlab interface and that in MATLAB bioinformatics toolbox?
The easiest way is to rename the svmtrain binary
file (e.g., svmtrain.mexw32 on 32-bit windows)
to a different
name (e.g., svmtrain2.mexw32).
Q: On Windows I got an error message "Invalid MEX-file: Specific module not found" when running the pre-built MATLAB interface in the windows sub-directory. What should I do?
The error usually happens
when there are missing runtime components
such as MSVCR100.dll on your Windows platform.
You can use tools such as
Dependency
Walker to find missing library files.
For example, if the pre-built MEX files are compiled by
Visual C++ 2010,
you must have installed
Microsoft Visual C++ Redistributable Package 2010
(vcredist_x86.exe). You can easily find the freely
available file from Microsoft's web site.
For 64bit Windows, the situation is similar. If
the pre-built files are by
Visual C++ 2008, then you must have
Microsoft Visual C++ Redistributable Package 2008
(vcredist_x64.exe).
Q: LIBSVM supports 1-vs-1 multi-class classification. If instead I would like to use 1-vs-rest, how to implement it using MATLAB interface?
Please use the code in the following directory. The following example shows how to
train and test the problem dna (training and testing sets).
Load, train and predict data:
[trainY trainX] = libsvmread('./dna.scale');
[testY testX] = libsvmread('./dna.scale.t');
model = ovrtrain(trainY, trainX, '-c 8 -g 4');
[pred ac decv] = ovrpredict(testY, testX, model);
fprintf('Accuracy = %g%%\n', ac * 100);
Conduct CV on a grid of parameters
bestcv = 0;
for log2c = -1:2:3,
for log2g = -4:2:1,
cmd = ['-q -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
cv = get_cv_ac(trainY, trainX, cmd, 3);
if (cv >= bestcv),
bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
end
fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', log2c, log2g, cv, bestc, bestg, bestcv);
end
end
================================================
FILE: binaries/windows/x86/README
================================================
Libsvm is a simple, easy-to-use, and efficient software for SVM
classification and regression. It solves C-SVM classification, nu-SVM
classification, one-class-SVM, epsilon-SVM regression, and nu-SVM
regression. It also provides an automatic model selection tool for
C-SVM classification. This document explains the use of libsvm.
Libsvm is available at
http://www.csie.ntu.edu.tw/~cjlin/libsvm
Please read the COPYRIGHT file before using libsvm.
Table of Contents
=================
- Quick Start
- Installation and Data Format
- `svm-train' Usage
- `svm-predict' Usage
- `svm-scale' Usage
- Tips on Practical Use
- Examples
- Precomputed Kernels
- Library Usage
- Java Version
- Building Windows Binaries
- Additional Tools: Sub-sampling, Parameter Selection, Format checking, etc.
- MATLAB/OCTAVE Interface
- Python Interface
- Additional Information
Quick Start
===========
If you are new to SVM and if the data is not large, please go to
the `tools' directory and use easy.py after installation. It does
everything automatically -- from data scaling to parameter selection.
Usage: easy.py training_file [testing_file]
More information about parameter selection can be found in
`tools/README.'
Installation and Data Format
============================
On Unix systems, type `make' to build the `svm-train' and `svm-predict'
programs. Run them without arguments to show their usage.
On other systems, consult `Makefile' to build them (e.g., see
'Building Windows binaries' in this file) or use the pre-built
binaries (Windows binaries are in the directory `windows').
The format of training and testing data file is:

<label> <index1>:<value1> <index2>:<value2> ...
.
.
.

Each line contains an instance and is ended by a '\n' character. For
classification, <label> is an integer indicating the class label
(multi-class is supported). For regression, <label> is the target
value which can be any real number. For one-class SVM, it's not used
so can be any number. The pair <index>:<value> gives a feature
(attribute) value: <index> is an integer starting from 1 and <value>
is a real number. The only exception is the precomputed kernel, where
<index> starts from 0; see the section of precomputed kernels. Indices
must be in ASCENDING order. Labels in the testing file are only used
to calculate accuracy or errors. If they are unknown, just fill the
first column with any numbers.
A sample classification data included in this package is
`heart_scale'. To check if your data is in a correct form, use
`tools/checkdata.py' (details in `tools/README').
Type `svm-train heart_scale', and the program will read the training
data and output the model file `heart_scale.model'. If you have a test
set called heart_scale.t, then type `svm-predict heart_scale.t
heart_scale.model output' to see the prediction accuracy. The `output'
file contains the predicted class labels.
For classification, if training data are in only one class (i.e., all
labels are the same), then `svm-train' issues a warning message:
`Warning: training data in only one class. See README for details,'
which means the training data is very unbalanced. The label in the
training data is directly returned when testing.
There are some other useful programs in this package.
svm-scale:
This is a tool for scaling input data file.
svm-toy:
This is a simple graphical interface which shows how SVM
separates data in a plane. You can click in the window to
draw data points. Use "change" button to choose class
1, 2 or 3 (i.e., up to three classes are supported), "load"
button to load data from a file, "save" button to save data to
a file, "run" button to obtain an SVM model, and "clear"
button to clear the window.
You can enter options in the bottom of the window, the syntax of
options is the same as `svm-train'.
Note that "load" and "save" consider dense data format both in
classification and the regression cases. For classification,
each data point has one label (the color) that must be 1, 2,
or 3 and two attributes (x-axis and y-axis values) in
[0,1). For regression, each data point has one target value
(y-axis) and one attribute (x-axis values) in [0, 1).
Type `make' in respective directories to build them.
You need Qt library to build the Qt version.
(available from http://www.trolltech.com)
You need GTK+ library to build the GTK version.
(available from http://www.gtk.org)
The pre-built Windows binaries are in the `windows'
directory. We use Visual C++ on a 32-bit machine, so the
maximal cache size is 2GB.
`svm-train' Usage
=================
Usage: svm-train [options] training_set_file [model_file]
options:
-s svm_type : set type of SVM (default 0)
0 -- C-SVC (multi-class classification)
1 -- nu-SVC (multi-class classification)
2 -- one-class SVM
3 -- epsilon-SVR (regression)
4 -- nu-SVR (regression)
-t kernel_type : set type of kernel function (default 2)
0 -- linear: u'*v
1 -- polynomial: (gamma*u'*v + coef0)^degree
2 -- radial basis function: exp(-gamma*|u-v|^2)
3 -- sigmoid: tanh(gamma*u'*v + coef0)
4 -- precomputed kernel (kernel values in training_set_file)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n: n-fold cross validation mode
-q : quiet mode (no outputs)
num_features in the -g option means the number of attributes in the input data.
Option -v randomly splits the data into n parts and calculates cross
validation accuracy/mean squared error on them.
See libsvm FAQ for the meaning of outputs.
`svm-predict' Usage
===================
Usage: svm-predict [options] test_file model_file output_file
options:
-b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); for one-class SVM only 0 is supported
model_file is the model file generated by svm-train.
test_file is the test data you want to predict.
svm-predict will produce output in the output_file.
`svm-scale' Usage
=================
Usage: svm-scale [options] data_filename
options:
-l lower : x scaling lower limit (default -1)
-u upper : x scaling upper limit (default +1)
-y y_lower y_upper : y scaling limits (default: no y scaling)
-s save_filename : save scaling parameters to save_filename
-r restore_filename : restore scaling parameters from restore_filename
See 'Examples' in this file for examples.
Tips on Practical Use
=====================
* Scale your data. For example, scale each attribute to [0,1] or [-1,+1].
* For C-SVC, consider using the model selection tool in the tools directory.
* nu in nu-SVC/one-class-SVM/nu-SVR approximates the fraction of training
errors and support vectors.
* If data for classification are unbalanced (e.g. many positive and
few negative), try different penalty parameters C by -wi (see
examples below).
* Specify larger cache size (i.e., larger -m) for huge problems.
Examples
========
> svm-scale -l -1 -u 1 -s range train > train.scale
> svm-scale -r range test > test.scale
Scale each feature of the training data to be in [-1,1]. Scaling
factors are stored in the file range and then used for scaling the
test data.
> svm-train -s 0 -c 5 -t 2 -g 0.5 -e 0.1 data_file
Train a classifier with RBF kernel exp(-0.5|u-v|^2), C=5, and
stopping tolerance 0.1.
> svm-train -s 3 -p 0.1 -t 0 data_file
Solve SVM regression with linear kernel u'v and epsilon=0.1
in the loss function.
> svm-train -c 10 -w1 1 -w-2 5 -w4 2 data_file
Train a classifier with penalty 10 = 1 * 10 for class 1, penalty 50 =
5 * 10 for class -2, and penalty 20 = 2 * 10 for class 4.
> svm-train -s 0 -c 100 -g 0.1 -v 5 data_file
Do five-fold cross validation for the classifier using
the parameters C = 100 and gamma = 0.1
> svm-train -s 0 -b 1 data_file
> svm-predict -b 1 test_file data_file.model output_file
Obtain a model with probability information and predict test data with
probability estimates
Precomputed Kernels
===================
Users may precompute kernel values and input them as training and
testing files. Then libsvm does not need the original
training/testing sets.
Assume there are L training instances x1, ..., xL.
Let K(x, y) be the kernel
value of two instances x and y. The input formats
are:
New training instance for xi:
0:i 1:K(xi,x1) ... L:K(xi,xL)
New testing instance for any x:
0:? 1:K(x,x1) ... L:K(x,xL)
That is, in the training file the first column must be the "ID" of
xi. In testing, ? can be any value.
All kernel values including ZEROs must be explicitly provided. Any
permutation or random subsets of the training/testing files are also
valid (see examples below).
Note: the format is slightly different from the precomputed kernel
package released in libsvmtools earlier.
Examples:
Assume the original training data has three four-feature
instances and testing data has one instance:
15 1:1 2:1 3:1 4:1
45 2:3 4:3
25 3:1
15 1:1 3:1
If the linear kernel is used, we have the following new
training/testing sets:
15 0:1 1:4 2:6 3:1
45 0:2 1:6 2:18 3:0
25 0:3 1:1 2:0 3:1
15 0:? 1:2 2:0 3:1
? can be any value.
Any subset of the above training file is also valid. For example,
25 0:3 1:1 2:0 3:1
45 0:2 1:6 2:18 3:0
implies that the kernel matrix is
[K(2,2) K(2,3)] = [18 0]
[K(3,2) K(3,3)] = [0 1]
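For reference, a short Python sketch (not part of the package) that reproduces the new training set above by computing the linear kernel values of the three original instances:

# Original instances as sparse dicts (index -> value), labels 15/45/25.
X = [{1: 1, 2: 1, 3: 1, 4: 1},
     {2: 3, 4: 3},
     {3: 1}]
y = [15, 45, 25]

def dot(a, b):  # linear kernel of two sparse vectors
    return sum(v * b.get(k, 0) for k, v in a.items())

for i, xi in enumerate(X):
    cols = " ".join("%d:%g" % (j + 1, dot(xi, xj)) for j, xj in enumerate(X))
    print("%g 0:%d %s" % (y[i], i + 1, cols))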
Library Usage
=============
These functions and structures are declared in the header file
`svm.h'. You need to #include "svm.h" in your C/C++ source files and
link your program with `svm.cpp'. You can see `svm-train.c' and
`svm-predict.c' for examples showing how to use them. We define
LIBSVM_VERSION and declare `extern int libsvm_version; ' in svm.h, so
you can check the version number.
Before you classify test data, you need to construct an SVM model
(`svm_model') using training data. A model can also be saved in
a file for later use. Once an SVM model is available, you can use it
to classify new data.
- Function: struct svm_model *svm_train(const struct svm_problem *prob,
const struct svm_parameter *param);
This function constructs and returns an SVM model according to
the given training data and parameters.
struct svm_problem describes the problem:
struct svm_problem
{
int l;
double *y;
struct svm_node **x;
};
where `l' is the number of training data, and `y' is an array containing
their target values. (integers in classification, real numbers in
regression) `x' is an array of pointers, each of which points to a sparse
representation (array of svm_node) of one training vector.
For example, if we have the following training data:
LABEL ATTR1 ATTR2 ATTR3 ATTR4 ATTR5
----- ----- ----- ----- ----- -----
1 0 0.1 0.2 0 0
2 0 0.1 0.3 -1.2 0
1 0.4 0 0 0 0
2 0 0.1 0 1.4 0.5
3 -0.1 -0.2 0.1 1.1 0.1
then the components of svm_problem are:
l = 5
y -> 1 2 1 2 3
x -> [ ] -> (2,0.1) (3,0.2) (-1,?)
[ ] -> (2,0.1) (3,0.3) (4,-1.2) (-1,?)
[ ] -> (1,0.4) (-1,?)
[ ] -> (2,0.1) (4,1.4) (5,0.5) (-1,?)
[ ] -> (1,-0.1) (2,-0.2) (3,0.1) (4,1.1) (5,0.1) (-1,?)
where (index,value) is stored in the structure `svm_node':
struct svm_node
{
int index;
double value;
};
index = -1 indicates the end of one vector. Note that indices must
be in ASCENDING order.
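For comparison, the Python interface (python/svm.py; in recent versions "from libsvm.svm import svm_problem") builds the same structure from plain lists and dictionaries. A minimal sketch of the five-instance example above:

from svm import svm_problem

# Each dict maps index -> value (sparse, ascending indices);
# the index = -1 terminator is added internally.
y = [1, 2, 1, 2, 3]
x = [{2: 0.1, 3: 0.2},
     {2: 0.1, 3: 0.3, 4: -1.2},
     {1: 0.4},
     {2: 0.1, 4: 1.4, 5: 0.5},
     {1: -0.1, 2: -0.2, 3: 0.1, 4: 1.1, 5: 0.1}]
prob = svm_problem(y, x)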
struct svm_parameter describes the parameters of an SVM model:
struct svm_parameter
{
int svm_type;
int kernel_type;
int degree; /* for poly */
double gamma; /* for poly/rbf/sigmoid */
double coef0; /* for poly/sigmoid */
/* these are for training only */
double cache_size; /* in MB */
double eps; /* stopping criteria */
double C; /* for C_SVC, EPSILON_SVR, and NU_SVR */
int nr_weight; /* for C_SVC */
int *weight_label; /* for C_SVC */
double* weight; /* for C_SVC */
double nu; /* for NU_SVC, ONE_CLASS, and NU_SVR */
double p; /* for EPSILON_SVR */
int shrinking; /* use the shrinking heuristics */
int probability; /* do probability estimates */
};
svm_type can be one of C_SVC, NU_SVC, ONE_CLASS, EPSILON_SVR, NU_SVR.
C_SVC: C-SVM classification
NU_SVC: nu-SVM classification
ONE_CLASS: one-class-SVM
EPSILON_SVR: epsilon-SVM regression
NU_SVR: nu-SVM regression
kernel_type can be one of LINEAR, POLY, RBF, SIGMOID, PRECOMPUTED.
LINEAR: u'*v
POLY: (gamma*u'*v + coef0)^degree
RBF: exp(-gamma*|u-v|^2)
SIGMOID: tanh(gamma*u'*v + coef0)
PRECOMPUTED: kernel values in training_set_file
cache_size is the size of the kernel cache, specified in megabytes.
C is the cost of constraints violation.
eps is the stopping criterion. (we usually use 0.00001 in nu-SVC,
0.001 in others). nu is the parameter in nu-SVM, nu-SVR, and
one-class-SVM. p is the epsilon in epsilon-insensitive loss function
of epsilon-SVM regression. shrinking = 1 means shrinking is conducted;
= 0 otherwise. probability = 1 means model with probability
information is obtained; = 0 otherwise.
nr_weight, weight_label, and weight are used to change the penalty
for some classes (If the weight for a class is not changed, it is
set to 1). This is useful for training a classifier using unbalanced
input data or with asymmetric misclassification cost.
nr_weight is the number of elements in the array weight_label and
weight. Each weight[i] corresponds to weight_label[i], meaning that
the penalty of class weight_label[i] is scaled by a factor of weight[i].
If you do not want to change penalty for any of the classes,
just set nr_weight to 0.
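For instance, a C-SVC parameter block using the RBF kernel, with the
penalty of class 2 doubled, might be set up as follows (a sketch only;
the numeric values mirror svm-train's defaults, and the choice of
class 2 is just for illustration):

#include "svm.h"

int wl[] = { 2 };        /* weight_label: class whose C is rescaled */
double w[] = { 2.0 };    /* weight: C of class 2 becomes 2*C */

struct svm_parameter param = {
	C_SVC, RBF,
	3,         /* degree (unused by RBF) */
	0.5,       /* gamma */
	0,         /* coef0 (unused by RBF) */
	100,       /* cache_size in MB */
	1e-3,      /* eps */
	1,         /* C */
	1, wl, w,  /* nr_weight, weight_label, weight */
	0.5,       /* nu (unused by C_SVC) */
	0.1,       /* p (unused by C_SVC) */
	1,         /* shrinking */
	0          /* probability */
};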
*NOTE* Because svm_model contains pointers to svm_problem, you can
not free the memory used by svm_problem if you are still using the
svm_model produced by svm_train().
*NOTE* To avoid wrong parameters, svm_check_parameter() should be
called before svm_train().
struct svm_model stores the model obtained from the training procedure.
It is not recommended to directly access entries in this structure.
Programmers should use the interface functions to get the values.
struct svm_model
{
struct svm_parameter param; /* parameter */
int nr_class; /* number of classes, = 2 in regression/one class svm */
int l; /* total #SV */
struct svm_node **SV; /* SVs (SV[l]) */
double **sv_coef; /* coefficients for SVs in decision functions (sv_coef[k-1][l]) */
double *rho; /* constants in decision functions (rho[k*(k-1)/2]) */
double *probA; /* pairwise probability information */
double *probB;
int *sv_indices; /* sv_indices[0,...,nSV-1] are values in [1,...,num_training_data] to indicate SVs in the training set */
/* for classification only */
int *label; /* label of each class (label[k]) */
int *nSV; /* number of SVs for each class (nSV[k]) */
/* nSV[0] + nSV[1] + ... + nSV[k-1] = l */
/* XXX */
int free_sv; /* 1 if svm_model is created by svm_load_model*/
/* 0 if svm_model is created by svm_train */
};
param describes the parameters used to obtain the model.
nr_class is the number of classes. It is 2 for regression and one-class SVM.
l is the number of support vectors. SV and sv_coef are support
vectors and the corresponding coefficients, respectively. Assume there are
k classes. For data in class j, the corresponding sv_coef includes (k-1) y*alpha vectors,
where alpha's are solutions of the following two class problems:
1 vs j, 2 vs j, ..., j-1 vs j, j vs j+1, j vs j+2, ..., j vs k
and y=1 for the first j-1 vectors, while y=-1 for the remaining k-j
vectors. For example, if there are 4 classes, sv_coef and SV are like:
+-+-+-+--------------------+
|1|1|1| |
|v|v|v| SVs from class 1 |
|2|3|4| |
+-+-+-+--------------------+
|1|2|2| |
|v|v|v| SVs from class 2 |
|2|3|4| |
+-+-+-+--------------------+
|1|2|3| |
|v|v|v| SVs from class 3 |
|3|3|4| |
+-+-+-+--------------------+
|1|2|3| |
|v|v|v| SVs from class 4 |
|4|4|4| |
+-+-+-+--------------------+
See svm_train() for an example of assigning values to sv_coef.
rho is the bias term (-b). probA and probB are parameters used in
probability outputs. If there are k classes, there are k*(k-1)/2
binary problems as well as rho, probA, and probB values. They are
aligned in the order of binary problems:
1 vs 2, 1 vs 3, ..., 1 vs k, 2 vs 3, ..., 2 vs k, ..., k-1 vs k.
sv_indices[0,...,nSV-1] are values in [1,...,num_training_data] to
indicate support vectors in the training set.
label contains labels in the training data.
nSV is the number of support vectors in each class.
free_sv is a flag used to determine whether the space of SV should
be released in free_model_content(struct svm_model*) and
free_and_destroy_model(struct svm_model**). If the model is
generated by svm_train(), then SV points to data in svm_problem
and should not be removed. For example, free_sv is 0 if svm_model
is created by svm_train, but is 1 if created by svm_load_model.
- Function: double svm_predict(const struct svm_model *model,
const struct svm_node *x);
This function does classification or regression on a test vector x
given a model.
For a classification model, the predicted class for x is returned.
For a regression model, the function value of x calculated using
the model is returned. For a one-class model, +1 or -1 is
returned.
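Putting these pieces together, a typical calling sequence looks like
the following fragment (a sketch assumed to sit inside a function,
with prob and param prepared as in the sketches above; the test
vector is made up for illustration):

const char *msg = svm_check_parameter(&prob, &param);
if (msg) { fprintf(stderr, "parameter error: %s\n", msg); return 1; }

struct svm_model *model = svm_train(&prob, &param);

struct svm_node test[] = {{1,0.3},{3,0.1},{-1,0}};
printf("predicted label: %g\n", svm_predict(model, test));

svm_save_model("example.model", model);  /* optional: save for later */
svm_free_and_destroy_model(&model);      /* prob must stay alive until here */
svm_destroy_param(&param);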
- Function: void svm_cross_validation(const struct svm_problem *prob,
const struct svm_parameter *param, int nr_fold, double *target);
This function conducts cross validation. Data are separated to
nr_fold folds. Under given parameters, sequentially each fold is
validated using the model from training the remaining. Predicted
labels (of all prob's instances) in the validation process are
stored in the array called target.
The format of svm_problem is the same as that for svm_train().
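For example, cross validation accuracy can be computed from target as
in this fragment (a sketch mirroring what svm-train does in its -v
mode; prob and param are assumed to be set up already):

double *target = (double *)malloc(prob.l * sizeof(double));
int i, correct = 0;
svm_cross_validation(&prob, &param, 5, target);   /* 5-fold CV */
for (i = 0; i < prob.l; i++)
	if (target[i] == prob.y[i])   /* predicted label vs true label */
		++correct;
printf("CV accuracy = %g%%\n", 100.0 * correct / prob.l);
free(target);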
- Function: int svm_get_svm_type(const struct svm_model *model);
This function gives svm_type of the model. Possible values of
svm_type are defined in svm.h.
- Function: int svm_get_nr_class(const svm_model *model);
For a classification model, this function gives the number of
classes. For a regression or one-class model, 2 is returned.
- Function: void svm_get_labels(const svm_model *model, int* label)
For a classification model, this function outputs the name of
labels into an array called label. For regression and one-class
models, label is unchanged.
- Function: void svm_get_sv_indices(const struct svm_model *model, int *sv_indices)
This function outputs indices of support vectors into an array called sv_indices.
The size of sv_indices is the number of support vectors and can be obtained by calling svm_get_nr_sv.
Each sv_indices[i] is in the range of [1, ..., num_training_data].
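A short fragment showing the intended usage (a sketch; model is
assumed to come from svm_train or svm_load_model):

int nr_sv = svm_get_nr_sv(model);
int *sv_idx = (int *)malloc(nr_sv * sizeof(int));
svm_get_sv_indices(model, sv_idx);
/* sv_idx[i] is the 1-based position of the i-th SV in the training set */
free(sv_idx);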
- Function: int svm_get_nr_sv(const struct svm_model *model)
This function gives the total number of support vectors.
- Function: double svm_get_svr_probability(const struct svm_model *model);
For a regression model with probability information, this function
outputs a value sigma > 0. For test data, we consider the
probability model: target value = predicted value + z, z: Laplace
distribution e^(-|z|/sigma)/(2sigma)
If the model is not for SVR or does not contain the required
information, 0 is returned.
- Function: double svm_predict_values(const svm_model *model,
const svm_node *x, double* dec_values)
This function gives decision values on a test vector x given a
model, and return the predicted label (classification) or
the function value (regression).
For a classification model with nr_class classes, this function
gives nr_class*(nr_class-1)/2 decision values in the array
dec_values, where nr_class can be obtained from the function
svm_get_nr_class. The order is label[0] vs. label[1], ...,
label[0] vs. label[nr_class-1], label[1] vs. label[2], ...,
label[nr_class-2] vs. label[nr_class-1], where label can be
obtained from the function svm_get_labels. The returned value is
the predicted class for x. Note that when nr_class = 1, this
function does not give any decision value.
For a regression model, dec_values[0] and the returned value are
both the function value of x calculated using the model. For a
one-class model, dec_values[0] is the decision value of x, while
the returned value is +1/-1.
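The following fragment sketches how the pairwise decision values can
be enumerated in that order (model and a test vector test are assumed
to exist; the names are illustrative):

int k = svm_get_nr_class(model);
int *label = (int *)malloc(k * sizeof(int));
double *dec = (double *)malloc(k*(k-1)/2 * sizeof(double));
int i, j, p = 0;
svm_get_labels(model, label);
svm_predict_values(model, test, dec);   /* also returns the predicted label */
for (i = 0; i < k; i++)
	for (j = i + 1; j < k; j++)
		printf("%d vs %d: %g\n", label[i], label[j], dec[p++]);
free(label); free(dec);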
- Function: double svm_predict_probability(const struct svm_model *model,
const struct svm_node *x, double* prob_estimates);
This function does classification or regression on a test vector x
given a model with probability information.
For a classification model with probability information, this
function gives nr_class probability estimates in the array
prob_estimates. nr_class can be obtained from the function
svm_get_nr_class. The class with the highest probability is
returned. For regression/one-class SVM, the array prob_estimates
is unchanged and the returned value is the same as that of
svm_predict.
- Function: const char *svm_check_parameter(const struct svm_problem *prob,
const struct svm_parameter *param);
This function checks whether the parameters are within the feasible
range of the problem. This function should be called before calling
svm_train() and svm_cross_validation(). It returns NULL if the
parameters are feasible, otherwise an error message is returned.
- Function: int svm_check_probability_model(const struct svm_model *model);
This function checks whether the model contains required
information to do probability estimates. If so, it returns
+1. Otherwise, 0 is returned. This function should be called
before calling svm_get_svr_probability and
svm_predict_probability.
- Function: int svm_save_model(const char *model_file_name,
const struct svm_model *model);
This function saves a model to a file; returns 0 on success, or -1
if an error occurs.
- Function: struct svm_model *svm_load_model(const char *model_file_name);
This function returns a pointer to the model read from the file,
or a null pointer if the model could not be loaded.
- Function: void svm_free_model_content(struct svm_model *model_ptr);
This function frees the memory used by the entries in a model structure.
- Function: void svm_free_and_destroy_model(struct svm_model **model_ptr_ptr);
This function frees the memory used by a model and destroys the model
structure. It is equivalent to svm_destroy_model, which
is deprecated after version 3.0.
- Function: void svm_destroy_param(struct svm_parameter *param);
This function frees the memory used by a parameter set.
- Function: void svm_set_print_string_function(void (*print_func)(const char *));
Users can specify their output format by a function. Use
svm_set_print_string_function(NULL);
for default printing to stdout.
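For example, training output can be silenced by installing a no-op
printer, which is essentially how svm-train implements its -q option:

static void print_null(const char *s) { (void)s; }

svm_set_print_string_function(print_null);  /* quiet mode */
/* ... training ... */
svm_set_print_string_function(NULL);        /* restore default stdout output */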
Java Version
============
The pre-compiled java class archive `libsvm.jar' and its source files are
in the java directory. To run the programs, use
java -classpath libsvm.jar svm_train
java -classpath libsvm.jar svm_predict
java -classpath libsvm.jar svm_toy
java -classpath libsvm.jar svm_scale
Note that you need Java 1.5 (5.0) or above to run it.
You may need to add Java runtime library (like classes.zip) to the classpath.
You may need to increase maximum Java heap size.
Library usages are similar to the C version. These functions are available:
public class svm {
public static final int LIBSVM_VERSION=317;
public static svm_model svm_train(svm_problem prob, svm_parameter param);
public static void svm_cross_validation(svm_problem prob, svm_parameter param, int nr_fold, double[] target);
public static int svm_get_svm_type(svm_model model);
public static int svm_get_nr_class(svm_model model);
public static void svm_get_labels(svm_model model, int[] label);
public static void svm_get_sv_indices(svm_model model, int[] indices);
public static int svm_get_nr_sv(svm_model model);
public static double svm_get_svr_probability(svm_model model);
public static double svm_predict_values(svm_model model, svm_node[] x, double[] dec_values);
public static double svm_predict(svm_model model, svm_node[] x);
public static double svm_predict_probability(svm_model model, svm_node[] x, double[] prob_estimates);
public static void svm_save_model(String model_file_name, svm_model model) throws IOException
public static svm_model svm_load_model(String model_file_name) throws IOException
public static String svm_check_parameter(svm_problem prob, svm_parameter param);
public static int svm_check_probability_model(svm_model model);
public static void svm_set_print_string_function(svm_print_interface print_func);
}
The library is in the "libsvm" package.
Note that in the Java version, svm_node[] is not ended with a node whose index = -1.
Users can specify their output format by
your_print_func = new svm_print_interface()
{
public void print(String s)
{
// your own format
}
};
svm.svm_set_print_string_function(your_print_func);
Building Windows Binaries
=========================
Windows binaries are in the directory `windows'. To build them via
Visual C++, use the following steps:
1. Open a DOS command box (or Visual Studio Command Prompt) and change
to libsvm directory. If environment variables of VC++ have not been
set, type
"C:\Program Files\Microsoft Visual Studio 10.0\VC\bin\vcvars32.bat"
You may have to modify the above command according to which version of
VC++ you have and where it is installed.
2. Type
nmake -f Makefile.win clean all
3. (optional) To build shared library libsvm.dll, type
nmake -f Makefile.win lib
Another way is to build them from the Visual C++ environment. See
details in the libsvm FAQ.
Additional Tools: Sub-sampling, Parameter Selection, Format checking, etc.
===========================================================================
See the README file in the tools directory.
MATLAB/OCTAVE Interface
=======================
Please check the file README in the directory `matlab'.
Python Interface
================
See the README file in python directory.
Additional Information
======================
If you find LIBSVM helpful, please cite it as
Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support
vector machines. ACM Transactions on Intelligent Systems and
Technology, 2:27:1--27:27, 2011. Software available at
http://www.csie.ntu.edu.tw/~cjlin/libsvm
LIBSVM implementation document is available at
http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf
For any questions and comments, please email cjlin@csie.ntu.edu.tw
Acknowledgments:
This work was supported in part by the National Science
Council of Taiwan via the grant NSC 89-2213-E-002-013.
The authors thank their group members and users
for many helpful discussions and comments. They are listed in
http://www.csie.ntu.edu.tw/~cjlin/libsvm/acknowledgements
================================================
FILE: binaries/windows/x86/README-GPU
================================================
GPU-Accelerated LIBSVM exploits the GPU, through the CUDA interface, to
speed up the training process. This package contains a new executable
for training classifiers, "svm-train-gpu.exe", together with the
original one. The new executable is used in exactly the same way as the
original one. It was built with the CUBLAS API version 2, which is
compatible with SDKs from 4.0 and up.
To test the executable "svm-train-gpu", you can run the easy.py script
located in the "tools" folder.
To observe the speed improvement between CPU and GPU execution, we
provide a relatively large custom dataset (train_set) that can be used
as input to easy.py.
FEATURES
Mode supported:
* C-SVC classification with the RBF kernel
Functionality / user interface:
* Same as LIBSVM
PREREQUISITES
* NVIDIA Graphics card with CUDA support
* Latest NVIDIA drivers for GPU
Additional Information
======================
If you find GPU-Accelerated LIBSVM helpful, please cite it as
A. Athanasopoulos, A. Dimou, V. Mezaris, I. Kompatsiaris, "GPU Acceleration for Support Vector Machines",
Proc. 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2011), Delft, The Netherlands, April 2011.
Software available at http://mklab.iti.gr/project/GPU-LIBSVM
================================================
FILE: binaries/windows/x86/tools/README
================================================
This directory includes some useful codes:
1. subset selection tools
2. parameter selection tools
3. LIBSVM format checking tools
Part I: Subset selection tools
Introduction
============
Training large data is time consuming. Sometimes one should work on a
smaller subset first. The python script subset.py randomly selects a
specified number of samples. For classification data, we provide a
stratified selection to ensure the same class distribution in the
subset.
Usage: subset.py [options] dataset number [output1] [output2]
This script selects a subset of the given data set.
options:
-s method : method of selection (default 0)
0 -- stratified selection (classification only)
1 -- random selection
output1 : the subset (optional)
output2 : the rest of data (optional)
If output1 is omitted, the subset will be printed on the screen.
Example
=======
> python subset.py heart_scale 100 file1 file2
From heart_scale, 100 samples are randomly selected and stored in
file1. All remaining instances are stored in file2.
Part II: Parameter Selection Tools
Introduction
============
grid.py is a parameter selection tool for C-SVM classification using
the RBF (radial basis function) kernel. It uses the cross validation
(CV) technique to estimate the accuracy of each parameter combination in
the specified range and helps you to decide the best parameters for
your problem.
grid.py directly executes libsvm binaries (so no python binding is needed)
for cross validation and then draws the contour of CV accuracy using gnuplot.
You must have libsvm and gnuplot installed before using it. The package
gnuplot is available at http://www.gnuplot.info/
On Mac OS X, the precompiled gnuplot binary needs the AquaTerm
library, which thus must be installed as well. In addition, this
version of gnuplot does not support png, so you need to change "set
term png transparent small" to another image format. For example, you
may use "set term pbm small color".
Usage: grid.py [grid_options] [svm_options] dataset
grid_options :
-log2c {begin,end,step | "null"} : set the range of c (default -5,15,2)
begin,end,step -- c_range = 2^{begin,...,begin+k*step,...,end}
"null" -- do not grid with c
-log2g {begin,end,step | "null"} : set the range of g (default 3,-15,-2)
begin,end,step -- g_range = 2^{begin,...,begin+k*step,...,end}
"null" -- do not grid with g
-v n : n-fold cross validation (default 5)
-svmtrain pathname : set svm executable path and name
-gnuplot {pathname | "null"} :
pathname -- set gnuplot executable path and name
"null" -- do not plot
-out {pathname | "null"} : (default dataset.out)
pathname -- set output file path and name
"null" -- do not output file
-png pathname : set graphic output file path and name (default dataset.png)
-resume [pathname] : resume the grid task using an existing output file (default pathname is dataset.out)
Use this option only if some parameters have been checked for the SAME data.
svm_options : additional options for svm-train
The program conducts v-fold cross validation using parameter C (and gamma)
= 2^begin, 2^(begin+step), ..., 2^end.
You can specify where the libsvm executable and gnuplot are using the
-svmtrain and -gnuplot parameters.
For Windows users, please use pgnuplot.exe. If you are using gnuplot
3.7.1, please upgrade to version 3.7.3 or higher, as version 3.7.1 has
a bug. If you use cygwin on Windows, please use gnuplot-x11.
If the task is terminated accidentally or you would like to change the
range of parameters, you can apply '-resume' to save time by re-using
previous results. You may specify the output file of a previous run
or use the default (i.e., dataset.out) without giving a name. Please
note that the same condition must be used in two runs. For example,
you cannot use '-v 10' earlier and resume the task with '-v 5'.
The value of some options can be "null". For example, `-log2c -1,0,1
-log2g "null"' means that C=2^-1,2^0,2^1 and g=LIBSVM's default gamma
value. That is, you do not conduct parameter selection on gamma.
Example
=======
> python grid.py -log2c -5,5,1 -log2g -4,0,1 -v 5 -m 300 heart_scale
Users (in particular MS Windows users) may need to specify the path of
executable files. You can either change paths in the beginning of
grid.py or specify them in the command line. For example,
> grid.py -log2c -5,5,1 -svmtrain "c:\Program Files\libsvm\windows\svm-train.exe" -gnuplot c:\tmp\gnuplot\binary\pgnuplot.exe -v 10 heart_scale
Output: two files
dataset.png: the CV accuracy contour plot generated by gnuplot
dataset.out: the CV accuracy at each (log2(C),log2(gamma))
The following example saves running time by loading the output file of a previous run.
> python grid.py -log2c -7,7,1 -log2g -5,2,1 -v 5 -resume heart_scale.out heart_scale
Parallel grid search
====================
You can conduct a parallel grid search by dispatching jobs to a
cluster of computers which share the same file system. First, you add
machine names in grid.py:
ssh_workers = ["linux1", "linux5", "linux5"]
and then setup your ssh so that the authentication works without
asking a password.
The same machine (e.g., linux5 here) can be listed more than once if
it has multiple CPUs or has more RAM. If the local machine is the
best, you can also enlarge the nr_local_worker. For example:
nr_local_worker = 2
Example:
> python grid.py heart_scale
[local] -1 -1 78.8889 (best c=0.5, g=0.5, rate=78.8889)
[linux5] -1 -7 83.3333 (best c=0.5, g=0.0078125, rate=83.3333)
[linux5] 5 -1 77.037 (best c=0.5, g=0.0078125, rate=83.3333)
[linux1] 5 -7 83.3333 (best c=0.5, g=0.0078125, rate=83.3333)
.
.
.
If -log2c, -log2g, or -v is not specified, default values are used.
If your system uses telnet instead of ssh, list the computer names in
telnet_workers instead.
Calling grid in Python
======================
In addition to using grid.py as a command-line tool, you can use it as a
Python module.
>>> rate, param = find_parameters(dataset, options)
You need to specify `dataset' and `options' (default ''). See the following example.
> python
>>> from grid import *
>>> rate, param = find_parameters('../heart_scale', '-log2c -1,1,1 -log2g -1,1,1')
[local] 0.0 0.0 rate=74.8148 (best c=1.0, g=1.0, rate=74.8148)
[local] 0.0 -1.0 rate=77.037 (best c=1.0, g=0.5, rate=77.037)
.
.
[local] -1.0 -1.0 rate=78.8889 (best c=0.5, g=0.5, rate=78.8889)
.
.
>>> rate
78.8889
>>> param
{'c': 0.5, 'g': 0.5}
Part III: LIBSVM format checking tools
Introduction
============
`svm-train' conducts only a simple check of the input data. To do a
detailed check, we provide a python script `checkdata.py.'
Usage: checkdata.py dataset
Exit status (returned value): 1 if there are errors, 0 otherwise.
This tool is written by Rong-En Fan at National Taiwan University.
Example
=======
> cat bad_data
1 3:1 2:4
> python checkdata.py bad_data
line 1: feature indices must be in an ascending order, previous/current features 3:1 2:4
Found 1 lines with error.
================================================
FILE: binaries/windows/x86/tools/checkdata.py
================================================
#!/usr/bin/env python
#
# A format checker for LIBSVM
#
#
# Copyright (c) 2007, Rong-En Fan
#
# All rights reserved.
#
# This program is distributed under the same license of the LIBSVM package.
#
from sys import argv, exit
import os.path
def err(line_no, msg):
print("line {0}: {1}".format(line_no, msg))
# works like float() but does not accept nan and inf
def my_float(x):
if x.lower().find("nan") != -1 or x.lower().find("inf") != -1:
raise ValueError
return float(x)
def main():
if len(argv) != 2:
print("Usage: {0} dataset".format(argv[0]))
exit(1)
dataset = argv[1]
if not os.path.exists(dataset):
print("dataset {0} not found".format(dataset))
exit(1)
line_no = 1
error_line_count = 0
for line in open(dataset, 'r'):
line_error = False
# each line must end with a newline character
if line[-1] != '\n':
err(line_no, "missing a newline character in the end")
line_error = True
nodes = line.split()
# check label
try:
label = nodes.pop(0)
if label.find(',') != -1:
# multi-label format
try:
for l in label.split(','):
l = my_float(l)
except:
err(line_no, "label {0} is not a valid multi-label form".format(label))
line_error = True
else:
try:
label = my_float(label)
except:
err(line_no, "label {0} is not a number".format(label))
line_error = True
except:
err(line_no, "missing label, perhaps an empty line?")
line_error = True
# check features
prev_index = -1
for i in range(len(nodes)):
try:
(index, value) = nodes[i].split(':')
index = int(index)
value = my_float(value)
# precomputed kernel's index starts from 0 and LIBSVM
# checks it. Hence, don't treat index 0 as an error.
if index < 0:
err(line_no, "feature index must be positive; wrong feature {0}".format(nodes[i]))
line_error = True
elif index <= prev_index:
err(line_no, "feature indices must be in an ascending order, previous/current features {0} {1}".format(nodes[i-1], nodes[i]))
line_error = True
prev_index = index
except:
err(line_no, "feature '{0}' not an <index>:<value> pair, <index> integer, <value> real number ".format(nodes[i]))
line_error = True
line_no += 1
if line_error:
error_line_count += 1
if error_line_count > 0:
print("Found {0} lines with error.".format(error_line_count))
return 1
else:
print("No error.")
return 0
if __name__ == "__main__":
exit(main())
================================================
FILE: binaries/windows/x86/tools/easy.py
================================================
#!/usr/bin/env python
import sys
import os
from subprocess import *
if len(sys.argv) <= 1:
print('Usage: {0} training_file [testing_file]'.format(sys.argv[0]))
raise SystemExit
# svm, grid, and gnuplot executable files
is_win32 = (sys.platform == 'win32')
if not is_win32:
svmscale_exe = "../svm-scale"
svmtrain_exe = "../svm-train-gpu"
svmpredict_exe = "../svm-predict"
grid_py = "./grid.py"
gnuplot_exe = "/usr/bin/gnuplot"
else:
# example for windows
svmscale_exe = r"..\windows\svm-scale.exe"
svmtrain_exe = r"..\windows\svm-train-gpu.exe"
svmpredict_exe = r"..\windows\svm-predict.exe"
gnuplot_exe = r"C:\Program Files (x86)\gnuplot\bin\pgnuplot.exe"
grid_py = r".\grid.py"
assert os.path.exists(svmscale_exe),"svm-scale executable not found"
assert os.path.exists(svmtrain_exe),"svm-train-gpu executable not found"
assert os.path.exists(svmpredict_exe),"svm-predict executable not found"
assert os.path.exists(gnuplot_exe),"gnuplot executable not found"
assert os.path.exists(grid_py),"grid.py not found"
train_pathname = sys.argv[1]
assert os.path.exists(train_pathname),"training file not found"
file_name = os.path.split(train_pathname)[1]
scaled_file = file_name + ".scale"
model_file = file_name + ".model"
range_file = file_name + ".range"
if len(sys.argv) > 2:
test_pathname = sys.argv[2]
file_name = os.path.split(test_pathname)[1]
assert os.path.exists(test_pathname),"testing file not found"
scaled_test_file = file_name + ".scale"
predict_test_file = file_name + ".predict"
cmd = '{0} -s "{1}" "{2}" > "{3}"'.format(svmscale_exe, range_file, train_pathname, scaled_file)
print('Scaling training data...')
Popen(cmd, shell = True, stdout = PIPE).communicate()
cmd = '{0} -svmtrain "{1}" -gnuplot "{2}" "{3}"'.format(grid_py, svmtrain_exe, gnuplot_exe, scaled_file)
print('Cross validation...')
f = Popen(cmd, shell = True, stdout = PIPE).stdout
line = ''
while True:
last_line = line
line = f.readline()
if not line: break
c,g,rate = map(float,last_line.split())
print('Best c={0}, g={1} CV rate={2}'.format(c,g,rate))
cmd = '{0} -c {1} -g {2} "{3}" "{4}"'.format(svmtrain_exe,c,g,scaled_file,model_file)
print('Training...')
Popen(cmd, shell = True, stdout = PIPE).communicate()
print('Output model: {0}'.format(model_file))
if len(sys.argv) > 2:
cmd = '{0} -r "{1}" "{2}" > "{3}"'.format(svmscale_exe, range_file, test_pathname, scaled_test_file)
print('Scaling testing data...')
Popen(cmd, shell = True, stdout = PIPE).communicate()
cmd = '{0} "{1}" "{2}" "{3}"'.format(svmpredict_exe, scaled_test_file, model_file, predict_test_file)
print('Testing...')
Popen(cmd, shell = True).communicate()
print('Output prediction: {0}'.format(predict_test_file))
================================================
FILE: binaries/windows/x86/tools/grid.py
================================================
#!/usr/bin/env python
__all__ = ['find_parameters']
import os, sys, traceback, getpass, time, re
from threading import Thread
from subprocess import *
if sys.version_info[0] < 3:
from Queue import Queue
else:
from queue import Queue
telnet_workers = []
ssh_workers = []
nr_local_worker = 1
class GridOption:
def __init__(self, dataset_pathname, options):
dirname = os.path.dirname(__file__)
if sys.platform != 'win32':
self.svmtrain_pathname = os.path.join(dirname, '../svm-train')
self.gnuplot_pathname = '/usr/bin/gnuplot'
else:
# example for windows
self.svmtrain_pathname = os.path.join(dirname, r'..\windows\svm-train.exe')
# svmtrain_pathname = r'c:\Program Files\libsvm\windows\svm-train.exe'
self.gnuplot_pathname = r'C:\Program Files (x86)\gnuplot\bin\pgnuplot.exe'
self.fold = 5
self.c_begin, self.c_end, self.c_step = -5, 15, 2
self.g_begin, self.g_end, self.g_step = 3, -15, -2
self.grid_with_c, self.grid_with_g = True, True
self.dataset_pathname = dataset_pathname
self.dataset_title = os.path.split(dataset_pathname)[1]
self.out_pathname = '{0}.out'.format(self.dataset_title)
self.png_pathname = '{0}.png'.format(self.dataset_title)
self.pass_through_string = ' '
self.resume_pathname = None
self.parse_options(options)
def parse_options(self, options):
if type(options) == str:
options = options.split()
i = 0
pass_through_options = []
while i < len(options):
if options[i] == '-log2c':
i = i + 1
if options[i] == 'null':
self.grid_with_c = False
else:
self.c_begin, self.c_end, self.c_step = map(float,options[i].split(','))
elif options[i] == '-log2g':
i = i + 1
if options[i] == 'null':
self.grid_with_g = False
else:
self.g_begin, self.g_end, self.g_step = map(float,options[i].split(','))
elif options[i] == '-v':
i = i + 1
self.fold = options[i]
elif options[i] in ('-c','-g'):
raise ValueError('Use -log2c and -log2g.')
elif options[i] == '-svmtrain':
i = i + 1
self.svmtrain_pathname = options[i]
elif options[i] == '-gnuplot':
i = i + 1
if options[i] == 'null':
self.gnuplot_pathname = None
else:
self.gnuplot_pathname = options[i]
elif options[i] == '-out':
i = i + 1
if options[i] == 'null':
self.out_pathname = None
else:
self.out_pathname = options[i]
elif options[i] == '-png':
i = i + 1
self.png_pathname = options[i]
elif options[i] == '-resume':
if i == (len(options)-1) or options[i+1].startswith('-'):
self.resume_pathname = self.dataset_title + '.out'
else:
i = i + 1
self.resume_pathname = options[i]
else:
pass_through_options.append(options[i])
i = i + 1
self.pass_through_string = ' '.join(pass_through_options)
if not os.path.exists(self.svmtrain_pathname):
raise IOError('svm-train executable not found')
if not os.path.exists(self.dataset_pathname):
raise IOError('dataset not found')
if self.resume_pathname and not os.path.exists(self.resume_pathname):
raise IOError('file for resumption not found')
if not self.grid_with_c and not self.grid_with_g:
raise ValueError('-log2c and -log2g should not be null simultaneously')
if self.gnuplot_pathname and not os.path.exists(self.gnuplot_pathname):
sys.stderr.write('gnuplot executable not found\n')
self.gnuplot_pathname = None
def redraw(db,best_param,gnuplot,options,tofile=False):
if len(db) == 0: return
begin_level = round(max(x[2] for x in db)) - 3
step_size = 0.5
best_log2c,best_log2g,best_rate = best_param
# if newly obtained c, g, or cv values are the same,
# then stop redrawing the contour.
if all(x[0] == db[0][0] for x in db): return
if all(x[1] == db[0][1] for x in db): return
if all(x[2] == db[0][2] for x in db): return
if tofile:
gnuplot.write(b"set term png transparent small linewidth 2 medium enhanced\n")
gnuplot.write("set output \"{0}\"\n".format(options.png_pathname.replace('\\','\\\\')).encode())
#gnuplot.write(b"set term postscript color solid\n")
#gnuplot.write("set output \"{0}.ps\"\n".format(options.dataset_title).encode())
elif sys.platform == 'win32':
gnuplot.write(b"set term windows\n")
else:
gnuplot.write( b"set term x11\n")
gnuplot.write(b"set xlabel \"log2(C)\"\n")
gnuplot.write(b"set ylabel \"log2(gamma)\"\n")
gnuplot.write("set xrange [{0}:{1}]\n".format(options.c_begin,options.c_end).encode())
gnuplot.write("set yrange [{0}:{1}]\n".format(options.g_begin,options.g_end).encode())
gnuplot.write(b"set contour\n")
gnuplot.write("set cntrparam levels incremental {0},{1},100\n".format(begin_level,step_size).encode())
gnuplot.write(b"unset surface\n")
gnuplot.write(b"unset ztics\n")
gnuplot.write(b"set view 0,0\n")
gnuplot.write("set title \"{0}\"\n".format(options.dataset_title).encode())
gnuplot.write(b"unset label\n")
gnuplot.write("set label \"Best log2(C) = {0} log2(gamma) = {1} accuracy = {2}%\" \
at screen 0.5,0.85 center\n". \
format(best_log2c, best_log2g, best_rate).encode())
gnuplot.write("set label \"C = {0} gamma = {1}\""
" at screen 0.5,0.8 center\n".format(2**best_log2c, 2**best_log2g).encode())
gnuplot.write(b"set key at screen 0.9,0.9\n")
gnuplot.write(b"splot \"-\" with lines\n")
db.sort(key = lambda x:(x[0], -x[1]))
prevc = db[0][0]
for line in db:
if prevc != line[0]:
gnuplot.write(b"\n")
prevc = line[0]
gnuplot.write("{0[0]} {0[1]} {0[2]}\n".format(line).encode())
gnuplot.write(b"e\n")
gnuplot.write(b"\n") # force gnuplot back to prompt when term set failure
gnuplot.flush()
def calculate_jobs(options):
def range_f(begin,end,step):
# like range, but works on non-integer too
seq = []
while True:
if step > 0 and begin > end: break
if step < 0 and begin < end: break
seq.append(begin)
begin = begin + step
return seq
def permute_sequence(seq):
n = len(seq)
if n <= 1: return seq
mid = int(n/2)
left = permute_sequence(seq[:mid])
right = permute_sequence(seq[mid+1:])
ret = [seq[mid]]
while left or right:
if left: ret.append(left.pop(0))
if right: ret.append(right.pop(0))
return ret
c_seq = permute_sequence(range_f(options.c_begin,options.c_end,options.c_step))
g_seq = permute_sequence(range_f(options.g_begin,options.g_end,options.g_step))
if not options.grid_with_c:
c_seq = [None]
if not options.grid_with_g:
g_seq = [None]
nr_c = float(len(c_seq))
nr_g = float(len(g_seq))
i, j = 0, 0
jobs = []
while i < nr_c or j < nr_g:
if i/nr_c < j/nr_g:
# increase C resolution
line = []
for k in range(0,j):
line.append((c_seq[i],g_seq[k]))
i = i + 1
jobs.append(line)
else:
# increase g resolution
line = []
for k in range(0,i):
line.append((c_seq[k],g_seq[j]))
j = j + 1
jobs.append(line)
resumed_jobs = {}
if options.resume_pathname is None:
return jobs, resumed_jobs
for line in open(options.resume_pathname, 'r'):
line = line.strip()
rst = re.findall(r'rate=([0-9.]+)',line)
if not rst:
continue
rate = float(rst[0])
c, g = None, None
rst = re.findall(r'log2c=([0-9.-]+)',line)
if rst:
c = float(rst[0])
rst = re.findall(r'log2g=([0-9.-]+)',line)
if rst:
g = float(rst[0])
resumed_jobs[(c,g)] = rate
return jobs, resumed_jobs
class WorkerStopToken: # used to notify the worker to stop or if a worker is dead
pass
class Worker(Thread):
def __init__(self,name,job_queue,result_queue,options):
Thread.__init__(self)
self.name = name
self.job_queue = job_queue
self.result_queue = result_queue
self.options = options
def run(self):
while True:
(cexp,gexp) = self.job_queue.get()
if cexp is WorkerStopToken:
self.job_queue.put((cexp,gexp))
# print('worker {0} stop.'.format(self.name))
break
try:
c, g = None, None
if cexp != None:
c = 2.0**cexp
if gexp != None:
g = 2.0**gexp
rate = self.run_one(c,g)
if rate is None: raise RuntimeError('get no rate')
except:
# we failed, let others do that and we just quit
traceback.print_exception(sys.exc_info()[0], sys.exc_info()[1], sys.exc_info()[2])
self.job_queue.put((cexp,gexp))
sys.stderr.write('worker {0} quit.\n'.format(self.name))
break
else:
self.result_queue.put((self.name,cexp,gexp,rate))
def get_cmd(self,c,g):
options=self.options
cmdline = options.svmtrain_pathname
if options.grid_with_c:
cmdline += ' -c {0} '.format(c)
if options.grid_with_g:
cmdline += ' -g {0} '.format(g)
cmdline += ' -v {0} {1} {2} '.format\
(options.fold,options.pass_through_string,options.dataset_pathname)
return cmdline
class LocalWorker(Worker):
def run_one(self,c,g):
cmdline = self.get_cmd(c,g)
result = Popen(cmdline,shell=True,stdout=PIPE,stderr=PIPE,stdin=PIPE).stdout
for line in result.readlines():
if str(line).find('Cross') != -1:
return float(line.split()[-1][0:-1])
class SSHWorker(Worker):
def __init__(self,name,job_queue,result_queue,host,options):
Worker.__init__(self,name,job_queue,result_queue,options)
self.host = host
self.cwd = os.getcwd()
def run_one(self,c,g):
cmdline = 'ssh -x -t -t {0} "cd {1}; {2}"'.format\
(self.host,self.cwd,self.get_cmd(c,g))
result = Popen(cmdline,shell=True,stdout=PIPE,stderr=PIPE,stdin=PIPE).stdout
for line in result.readlines():
if str(line).find('Cross') != -1:
return float(line.split()[-1][0:-1])
class TelnetWorker(Worker):
def __init__(self,name,job_queue,result_queue,host,username,password,options):
Worker.__init__(self,name,job_queue,result_queue,options)
self.host = host
self.username = username
self.password = password
def run(self):
import telnetlib
self.tn = tn = telnetlib.Telnet(self.host)
tn.read_until('login: ')
tn.write(self.username + '\n')
tn.read_until('Password: ')
tn.write(self.password + '\n')
# XXX: how to know whether login is successful?
tn.read_until(self.username)
#
print('login ok', self.host)
tn.write('cd '+os.getcwd()+'\n')
Worker.run(self)
tn.write('exit\n')
def run_one(self,c,g):
cmdline = self.get_cmd(c,g)
result = self.tn.write(cmdline+'\n')
(idx,matchm,output) = self.tn.expect(['Cross.*\n'])
for line in output.split('\n'):
if str(line).find('Cross') != -1:
return float(line.split()[-1][0:-1])
def find_parameters(dataset_pathname, options=''):
def update_param(c,g,rate,best_c,best_g,best_rate,worker,resumed):
if (rate > best_rate) or (rate==best_rate and g==best_g and c<best_c):
best_rate,best_c,best_g = rate,c,g
================================================
FILE: binaries/windows/x86/tools/subset.py
================================================
#!/usr/bin/env python
import os, sys, math, random
from collections import defaultdict
if sys.version_info[0] >= 3:
xrange = range
def exit_with_help(argv):
print("""\
Usage: {0} [options] dataset subset_size [output1] [output2]
This script randomly selects a subset of the dataset.
options:
-s method : method of selection (default 0)
0 -- stratified selection (classification only)
1 -- random selection
output1 : the subset (optional)
output2 : rest of the data (optional)
If output1 is omitted, the subset will be printed on the screen.""".format(argv[0]))
exit(1)
def process_options(argv):
argc = len(argv)
if argc < 3:
exit_with_help(argv)
# default method is stratified selection
method = 0
subset_file = sys.stdout
rest_file = None
i = 1
while i < argc:
if argv[i][0] != "-":
break
if argv[i] == "-s":
i = i + 1
method = int(argv[i])
if method not in [0,1]:
print("Unknown selection method {0}".format(method))
exit_with_help(argv)
i = i + 1
dataset = argv[i]
subset_size = int(argv[i+1])
if i+2 < argc:
subset_file = open(argv[i+2],'w')
if i+3 < argc:
rest_file = open(argv[i+3],'w')
return dataset, subset_size, method, subset_file, rest_file
def random_selection(dataset, subset_size):
l = sum(1 for line in open(dataset,'r'))
return sorted(random.sample(xrange(l), subset_size))
def stratified_selection(dataset, subset_size):
labels = [line.split(None,1)[0] for line in open(dataset)]
label_linenums = defaultdict(list)
for i, label in enumerate(labels):
label_linenums[label] += [i]
l = len(labels)
remaining = subset_size
ret = []
# classes with fewer data are sampled first; otherwise
# some rare classes may not be selected
for label in sorted(label_linenums, key=lambda x: len(label_linenums[x])):
linenums = label_linenums[label]
label_size = len(linenums)
# at least one instance per class
s = int(min(remaining, max(1, math.ceil(label_size*(float(subset_size)/l)))))
if s == 0:
sys.stderr.write('''\
Error: failed to have at least one instance per class
1. You may have regression data.
2. Your classification data is unbalanced or too small.
Please use -s 1.
''')
sys.exit(-1)
remaining -= s
ret += [linenums[i] for i in random.sample(xrange(label_size), s)]
return sorted(ret)
def main(argv=sys.argv):
dataset, subset_size, method, subset_file, rest_file = process_options(argv)
#uncomment the following line to fix the random seed
#random.seed(0)
selected_lines = []
if method == 0:
selected_lines = stratified_selection(dataset, subset_size)
elif method == 1:
selected_lines = random_selection(dataset, subset_size)
#select instances based on selected_lines
dataset = open(dataset,'r')
prev_selected_linenum = -1
for i in xrange(len(selected_lines)):
for cnt in xrange(selected_lines[i]-prev_selected_linenum-1):
line = dataset.readline()
if rest_file:
rest_file.write(line)
subset_file.write(dataset.readline())
prev_selected_linenum = selected_lines[i]
subset_file.close()
if rest_file:
for line in dataset:
rest_file.write(line)
rest_file.close()
dataset.close()
if __name__ == '__main__':
main(sys.argv)
================================================
FILE: binaries/windows/x86/train_set
================================================
[File too large to display: 10.8 MB]
================================================
FILE: src/linux/COPYRIGHT
================================================
Copyright (c) 2000-2010 Chih-Chung Chang and Chih-Jen Lin
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither name of copyright holders nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
================================================
FILE: src/linux/Makefile
================================================
################################################################################
#
# Copyright 1993-2013 NVIDIA Corporation. All rights reserved.
#
# NOTICE TO USER:
#
# This source code is subject to NVIDIA ownership rights under U.S. and
# international Copyright laws.
#
# NVIDIA MAKES NO REPRESENTATION ABOUT THE SUITABILITY OF THIS SOURCE
# CODE FOR ANY PURPOSE. IT IS PROVIDED "AS IS" WITHOUT EXPRESS OR
# IMPLIED WARRANTY OF ANY KIND. NVIDIA DISCLAIMS ALL WARRANTIES WITH
# REGARD TO THIS SOURCE CODE, INCLUDING ALL IMPLIED WARRANTIES OF
# MERCHANTABILITY, NONINFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE.
# IN NO EVENT SHALL NVIDIA BE LIABLE FOR ANY SPECIAL, INDIRECT, INCIDENTAL,
# OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
# OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
# OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE
# OR PERFORMANCE OF THIS SOURCE CODE.
#
# U.S. Government End Users. This source code is a "commercial item" as
# that term is defined at 48 C.F.R. 2.101 (OCT 1995), consisting of
# "commercial computer software" and "commercial computer software
# documentation" as such terms are used in 48 C.F.R. 12.212 (SEPT 1995)
# and is provided to the U.S. Government only as a commercial end item.
# Consistent with 48 C.F.R.12.212 and 48 C.F.R. 227.7202-1 through
# 227.7202-4 (JUNE 1995), all U.S. Government End Users acquire the
# source code with only those rights set forth herein.
#
################################################################################
#
# Makefile project only supported on Mac OS X and Linux Platforms
#
################################################################################
include ./findcudalib.mk
# Location of the CUDA Toolkit
CUDA_PATH ?= "/usr/local/cuda-5.5"
# internal flags
NVCCFLAGS := -m${OS_SIZE} -maxrregcount=16
CCFLAGS :=
NVCCLDFLAGS :=
LDFLAGS :=
# Extra user flags
EXTRA_NVCCFLAGS ?=
EXTRA_NVCCLDFLAGS ?=
EXTRA_LDFLAGS ?=
EXTRA_CCFLAGS ?=
# OS-specific build flags
ifneq ($(DARWIN),)
LDFLAGS += -rpath $(CUDA_PATH)/lib
CCFLAGS += -arch $(OS_ARCH) $(STDLIB)
else
ifeq ($(OS_ARCH),armv7l)
ifeq ($(abi),gnueabi)
CCFLAGS += -mfloat-abi=softfp
else
# default to gnueabihf
override abi := gnueabihf
LDFLAGS += --dynamic-linker=/lib/ld-linux-armhf.so.3
CCFLAGS += -mfloat-abi=hard
endif
endif
endif
ifeq ($(ARMv7),1)
NVCCFLAGS += -target-cpu-arch ARM
ifneq ($(TARGET_FS),)
CCFLAGS += --sysroot=$(TARGET_FS)
LDFLAGS += --sysroot=$(TARGET_FS)
LDFLAGS += -rpath-link=$(TARGET_FS)/lib
LDFLAGS += -rpath-link=$(TARGET_FS)/usr/lib
LDFLAGS += -rpath-link=$(TARGET_FS)/usr/lib/arm-linux-$(abi)
endif
endif
# Debug build flags
ifeq ($(dbg),1)
NVCCFLAGS += -g -G
TARGET := debug
else
TARGET := release
endif
ALL_CCFLAGS :=
ALL_CCFLAGS += $(NVCCFLAGS)
ALL_CCFLAGS += $(addprefix -Xcompiler ,$(CCFLAGS))
ALL_CCFLAGS += $(EXTRA_NVCCFLAGS)
ALL_CCFLAGS += $(addprefix -Xcompiler ,$(EXTRA_CCFLAGS))
ALL_LDFLAGS :=
ALL_LDFLAGS += $(ALL_CCFLAGS)
ALL_LDFLAGS += $(NVCCLDFLAGS)
ALL_LDFLAGS += $(addprefix -Xlinker ,$(LDFLAGS))
ALL_LDFLAGS += $(EXTRA_NVCCLDFLAGS)
ALL_LDFLAGS += $(addprefix -Xlinker ,$(EXTRA_LDFLAGS))
# Common includes and paths for CUDA
INCLUDES := -I/usr/local/cuda-5.5/include
INCLUDES += -I/usr/local/cuda-5.5/samples/common/inc
LIBRARIES := -L/usr/local/cuda-5.5/lib64
################################################################################
LIBRARIES += -lcublas -lcudart
################################################################################
# CUDA code generation flags
ifneq ($(OS_ARCH),armv7l)
GENCODE_SM10 := -gencode arch=compute_10,code=sm_10
endif
GENCODE_SM20 := -gencode arch=compute_20,code=sm_20
GENCODE_SM30 := -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=\"sm_35,compute_35\"
GENCODE_FLAGS := $(GENCODE_SM10) $(GENCODE_SM20) $(GENCODE_SM30)
################################################################################
# Target rules
all: build
build: svm-train-gpu
svm-train-gpu.o: svm.cpp svm-train.o
$(NVCC) $(INCLUDES) $(ALL_CCFLAGS) $(GENCODE_FLAGS) -o $@ -c $<
svm-train-gpu: svm-train-gpu.o svm-train.o
$(NVCC) $(ALL_LDFLAGS) -o $@ $+ $(LIBRARIES)
mkdir -p /bin/$(OS_ARCH)/$(OSLOWER)/$(TARGET)$(if $(abi),/$(abi))
cp $@ /bin/$(OS_ARCH)/$(OSLOWER)/$(TARGET)$(if $(abi),/$(abi))
run: build
./svm-train-gpu
clean:
rm -f svm-train-gpu.o svm-train-gpu
rm -rf /bin/$(OS_ARCH)/$(OSLOWER)/$(TARGET)$(if $(abi),/$(abi))/svm-train-gpu
clobber: clean
================================================
FILE: src/linux/README
================================================
Libsvm is a simple, easy-to-use, and efficient software for SVM
classification and regression. It solves C-SVM classification, nu-SVM
classification, one-class-SVM, epsilon-SVM regression, and nu-SVM
regression. It also provides an automatic model selection tool for
C-SVM classification. This document explains the use of libsvm.
Libsvm is available at
http://www.csie.ntu.edu.tw/~cjlin/libsvm
Please read the COPYRIGHT file before using libsvm.
Table of Contents
=================
- Quick Start
- Installation and Data Format
- `svm-train' Usage
- `svm-predict' Usage
- `svm-scale' Usage
- Tips on Practical Use
- Examples
- Precomputed Kernels
- Library Usage
- Java Version
- Building Windows Binaries
- Additional Tools: Sub-sampling, Parameter Selection, Format checking, etc.
- Python Interface
- Additional Information
Quick Start
===========
If you are new to SVM and your data is not large, please go to the
`tools' directory and use easy.py after installation. It does
everything automatically -- from data scaling to parameter selection.
Usage: easy.py training_file [testing_file]
More information about parameter selection can be found in
`tools/README.'
Installation and Data Format
============================
On Unix systems, type `make' to build the `svm-train' and `svm-predict'
programs. Run them without arguments to show their usage.
On other systems, consult `Makefile' to build them (e.g., see
'Building Windows binaries' in this file) or use the pre-built
binaries (Windows binaries are in the directory `windows').
The format of training and testing data file is:

<label> <index1>:<value1> <index2>:<value2> ...
.
.
.

Each line contains an instance and is ended by a '\n' character. For
classification, <label> is an integer indicating the class label
(multi-class is supported). For regression, <label> is the target
value which can be any real number. For one-class SVM, it's not used
so can be any number. Except using precomputed kernels (explained in
another section), <index>:<value> gives a feature (attribute) value:
<index> is an integer starting from 1 and <value> is a real
number. Indices must be in ASCENDING order. Labels in the testing
file are only used to calculate accuracy or errors. If they are
unknown, just fill the first column with any numbers.
A sample classification data included in this package is
`heart_scale'. To check if your data is in a correct form, use
`tools/checkdata.py' (details in `tools/README').
Type `svm-train heart_scale', and the program will read the training
data and output the model file `heart_scale.model'. If you have a test
set called heart_scale.t, then type `svm-predict heart_scale.t
heart_scale.model output' to see the prediction accuracy. The `output'
file contains the predicted class labels.
There are some other useful programs in this package.
svm-scale:
This is a tool for scaling input data file.
svm-toy:
This is a simple graphical interface which shows how SVM
separate data in a plane. You can click in the window to
draw data points. Use "change" button to choose class
1, 2 or 3 (i.e., up to three classes are supported), "load"
button to load data from a file, "save" button to save data to
a file, "run" button to obtain an SVM model, and "clear"
button to clear the window.
You can enter options in the bottom of the window, the syntax of
options is the same as `svm-train'.
Note that "load" and "save" consider data in the
classification but not the regression case. Each data point
has one label (the color) which must be 1, 2, or 3 and two
attributes (x-axis and y-axis values) in [0,1].
Type `make' in respective directories to build them.
You need Qt library to build the Qt version.
(available from http://www.trolltech.com)
You need GTK+ library to build the GTK version.
(available from http://www.gtk.org)
The pre-built Windows binaries are in the `windows'
directory. We use Visual C++ on a 32-bit machine, so the
maximal cache size is 2GB.
`svm-train' Usage
=================
Usage: svm-train [options] training_set_file [model_file]
options:
-s svm_type : set type of SVM (default 0)
0 -- C-SVC
1 -- nu-SVC
2 -- one-class SVM
3 -- epsilon-SVR
4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
0 -- linear: u'*v
1 -- polynomial: (gamma*u'*v + coef0)^degree
2 -- radial basis function: exp(-gamma*|u-v|^2)
3 -- sigmoid: tanh(gamma*u'*v + coef0)
4 -- precomputed kernel (kernel values in training_set_file)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n: n-fold cross validation mode
-q : quiet mode (no outputs)
The default value of gamma (-g) is 1/num_features, where num_features
is the number of attributes in the input data. The -v option randomly
splits the data into n parts and calculates cross validation
accuracy/mean squared error on them.
See libsvm FAQ for the meaning of outputs.
`svm-predict' Usage
===================
Usage: svm-predict [options] test_file model_file output_file
options:
-b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); for one-class SVM only 0 is supported
model_file is the model file generated by svm-train.
test_file is the test data you want to predict.
svm-predict will produce output in the output_file.
`svm-scale' Usage
=================
Usage: svm-scale [options] data_filename
options:
-l lower : x scaling lower limit (default -1)
-u upper : x scaling upper limit (default +1)
-y y_lower y_upper : y scaling limits (default: no y scaling)
-s save_filename : save scaling parameters to save_filename
-r restore_filename : restore scaling parameters from restore_filename
See 'Examples' in this file for examples.
Tips on Practical Use
=====================
* Scale your data. For example, scale each attribute to [0,1] or [-1,+1].
* For C-SVC, consider using the model selection tool in the tools directory.
* nu in nu-SVC/one-class-SVM/nu-SVR approximates the fraction of training
errors and support vectors.
* If data for classification are unbalanced (e.g. many positive and
few negative), try different penalty parameters C by -wi (see
examples below).
* Specify larger cache size (i.e., larger -m) for huge problems.
Examples
========
> svm-scale -l -1 -u 1 -s range train > train.scale
> svm-scale -r range test > test.scale
Scale each feature of the training data to be in [-1,1]. Scaling
factors are stored in the file range and then used for scaling the
test data.
> svm-train -s 0 -c 5 -t 2 -g 0.5 -e 0.1 data_file
Train a classifier with RBF kernel exp(-0.5|u-v|^2), C=5, and
stopping tolerance 0.1.
> svm-train -s 3 -p 0.1 -t 0 data_file
Solve SVM regression with linear kernel u'v and epsilon=0.1
in the loss function.
> svm-train -c 10 -w1 1 -w2 5 -w4 2 data_file
Train a classifier with penalty 10 = 1 * 10 for class 1, penalty 50 =
5 * 10 for class 2, and penalty 20 = 2 * 10 for class 4.
> svm-train -s 0 -c 100 -g 0.1 -v 5 data_file
Do five-fold cross validation for the classifier using
the parameters C = 100 and gamma = 0.1
> svm-train -s 0 -b 1 data_file
> svm-predict -b 1 test_file data_file.model output_file
Obtain a model with probability information and predict test data with
probability estimates
Precomputed Kernels
===================
Users may precompute kernel values and input them as training and
testing files. Then libsvm does not need the original
training/testing sets.
Assume there are L training instances x1, ..., xL. Let K(x, y) be the
kernel value of two instances x and y. The input formats are:
New training instance for xi:
0:i 1:K(xi,x1) ... L:K(xi,xL)
New testing instance for any x:
0:? 1:K(x,x1) ... L:K(x,xL)
That is, in the training file the first column must be the "ID" of
xi. In testing, ? can be any value.
All kernel values including ZEROs must be explicitly provided. Any
permutation or random subsets of the training/testing files are also
valid (see examples below).
Note: the format is slightly different from the precomputed kernel
package released in libsvmtools earlier.
Examples:
Assume the original training data has three four-feature
instances and testing data has one instance:
15 1:1 2:1 3:1 4:1
45 2:3 4:3
25 3:1
15 1:1 3:1
If the linear kernel is used, we have the following new
training/testing sets:
15 0:1 1:4 2:6 3:1
45 0:2 1:6 2:18 3:0
25 0:3 1:1 2:0 3:1
15 0:? 1:2 2:0 3:1
? can be any value.
Any subset of the above training file is also valid. For example,
25 0:3 1:1 2:0 3:1
45 0:2 1:6 2:18 3:0
implies that the kernel matrix is
[K(2,2) K(2,3)] = [18 0]
[K(3,2) K(3,3)] = [0 1]
Library Usage
=============
These functions and structures are declared in the header file
`svm.h'. You need to #include "svm.h" in your C/C++ source files and
link your program with `svm.cpp'. You can see `svm-train.c' and
`svm-predict.c' for examples showing how to use them. We define
LIBSVM_VERSION and declare `extern int libsvm_version; ' in svm.h, so
you can check the version number.
Before you classify test data, you need to construct an SVM model
(`svm_model') using training data. A model can also be saved in
a file for later use. Once an SVM model is available, you can use it
to classify new data.
- Function: struct svm_model *svm_train(const struct svm_problem *prob,
const struct svm_parameter *param);
This function constructs and returns an SVM model according to
the given training data and parameters.
struct svm_problem describes the problem:
struct svm_problem
{
int l;
double *y;
struct svm_node **x;
};
where `l' is the number of training data, and `y' is an array containing
their target values. (integers in classification, real numbers in
regression) `x' is an array of pointers, each of which points to a sparse
representation (array of svm_node) of one training vector.
For example, if we have the following training data:
LABEL ATTR1 ATTR2 ATTR3 ATTR4 ATTR5
----- ----- ----- ----- ----- -----
1 0 0.1 0.2 0 0
2 0 0.1 0.3 -1.2 0
1 0.4 0 0 0 0
2 0 0.1 0 1.4 0.5
3 -0.1 -0.2 0.1 1.1 0.1
then the components of svm_problem are:
l = 5
y -> 1 2 1 2 3
x -> [ ] -> (2,0.1) (3,0.2) (-1,?)
[ ] -> (2,0.1) (3,0.3) (4,-1.2) (-1,?)
[ ] -> (1,0.4) (-1,?)
[ ] -> (2,0.1) (4,1.4) (5,0.5) (-1,?)
[ ] -> (1,-0.1) (2,-0.2) (3,0.1) (4,1.1) (5,0.1) (-1,?)
where (index,value) is stored in the structure `svm_node':
struct svm_node
{
int index;
double value;
};
index = -1 indicates the end of one vector. Note that indices must
be in ASCENDING order.
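To make the layout concrete, here is a minimal sketch that builds an
svm_problem for two of the instances above by hand (the variable names
are illustrative only):

	#include "svm.h"

	/* Two sparse training vectors; each ends with a node whose index is -1. */
	struct svm_node x0[] = { {2,0.1}, {3,0.2}, {-1,0} };
	struct svm_node x1[] = { {1,0.4}, {-1,0} };
	struct svm_node *x[] = { x0, x1 };
	double y[] = { 1, 1 };
	struct svm_problem prob = { 2, y, x };  /* fields: l, y, x */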
struct svm_parameter describes the parameters of an SVM model:
struct svm_parameter
{
int svm_type;
int kernel_type;
int degree; /* for poly */
double gamma; /* for poly/rbf/sigmoid */
double coef0; /* for poly/sigmoid */
/* these are for training only */
double cache_size; /* in MB */
double eps; /* stopping criteria */
double C; /* for C_SVC, EPSILON_SVR, and NU_SVR */
int nr_weight; /* for C_SVC */
int *weight_label; /* for C_SVC */
double* weight; /* for C_SVC */
double nu; /* for NU_SVC, ONE_CLASS, and NU_SVR */
double p; /* for EPSILON_SVR */
int shrinking; /* use the shrinking heuristics */
int probability; /* do probability estimates */
};
svm_type can be one of C_SVC, NU_SVC, ONE_CLASS, EPSILON_SVR, NU_SVR.
C_SVC: C-SVM classification
NU_SVC: nu-SVM classification
ONE_CLASS: one-class-SVM
EPSILON_SVR: epsilon-SVM regression
NU_SVR: nu-SVM regression
kernel_type can be one of LINEAR, POLY, RBF, SIGMOID, PRECOMPUTED.
LINEAR: u'*v
POLY: (gamma*u'*v + coef0)^degree
RBF: exp(-gamma*|u-v|^2)
SIGMOID: tanh(gamma*u'*v + coef0)
PRECOMPUTED: kernel values in training_set_file
cache_size is the size of the kernel cache, specified in megabytes.
C is the cost of constraint violation.
eps is the stopping criterion. (we usually use 0.00001 in nu-SVC,
0.001 in others). nu is the parameter in nu-SVM, nu-SVR, and
one-class-SVM. p is the epsilon in epsilon-insensitive loss function
of epsilon-SVM regression. shrinking = 1 means shrinking is conducted;
= 0 otherwise. probability = 1 means model with probability
information is obtained; = 0 otherwise.
nr_weight, weight_label, and weight are used to change the penalty
for some classes (if the weight for a class is not changed, it is
set to 1). This is useful for training a classifier on unbalanced
input data or with asymmetric misclassification costs.
nr_weight is the number of elements in the array weight_label and
weight. Each weight[i] corresponds to weight_label[i], meaning that
the penalty of class weight_label[i] is scaled by a factor of weight[i].
If you do not want to change penalty for any of the classes,
just set nr_weight to 0.
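For example, a minimal sketch (assuming the other fields of param are
already initialized; weight_class_one is an illustrative name) that
triples the penalty of class 1:

	#include <stdlib.h>
	#include "svm.h"

	static void weight_class_one(struct svm_parameter *param)
	{
		/* one (label, weight) pair: the penalty of class 1 becomes 3*C */
		param->nr_weight = 1;
		param->weight_label = (int *)malloc(sizeof(int));
		param->weight = (double *)malloc(sizeof(double));
		param->weight_label[0] = 1;
		param->weight[0] = 3.0;
	}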
*NOTE* Because svm_model contains pointers to svm_problem, you cannot
free the memory used by svm_problem if you are still using the
svm_model produced by svm_train().
*NOTE* To avoid wrong parameters, svm_check_parameter() should be
called before svm_train().
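Putting the two notes together, a minimal sketch of the intended call
order (train_and_predict is an illustrative name, not a library
function):

	#include <stdio.h>
	#include "svm.h"

	void train_and_predict(struct svm_problem *prob,
	                       struct svm_parameter *param,
	                       const struct svm_node *test)
	{
		const char *err = svm_check_parameter(prob, param); /* before svm_train() */
		if (err) { fprintf(stderr, "ERROR: %s\n", err); return; }

		struct svm_model *model = svm_train(prob, param);
		printf("prediction = %g\n", svm_predict(model, test));

		svm_free_and_destroy_model(&model); /* only now may prob's memory be freed */
	}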
struct svm_model stores the model obtained from the training procedure.
It is not recommended to directly access entries in this structure.
Programmers should use the interface functions to get the values.
struct svm_model
{
struct svm_parameter param; /* parameter */
int nr_class; /* number of classes, = 2 in regression/one class svm */
int l; /* total #SV */
struct svm_node **SV; /* SVs (SV[l]) */
double **sv_coef; /* coefficients for SVs in decision functions (sv_coef[k-1][l]) */
double *rho; /* constants in decision functions (rho[k*(k-1)/2]) */
double *probA; /* pairwise probability information */
double *probB;
/* for classification only */
int *label; /* label of each class (label[k]) */
int *nSV; /* number of SVs for each class (nSV[k]) */
/* nSV[0] + nSV[1] + ... + nSV[k-1] = l */
/* XXX */
int free_sv; /* 1 if svm_model is created by svm_load_model*/
/* 0 if svm_model is created by svm_train */
};
param describes the parameters used to obtain the model.
nr_class is the number of classes. It is 2 for regression and one-class SVM.
l is the number of support vectors. SV and sv_coef are support
vectors and the corresponding coefficients, respectively. Assume there are
k classes. For data in class j, the corresponding sv_coef includes (k-1) y*alpha vectors,
where alpha's are solutions of the following two class problems:
1 vs j, 2 vs j, ..., j-1 vs j, j vs j+1, j vs j+2, ..., j vs k
and y=1 for the first j-1 vectors, while y=-1 for the remaining k-j
vectors. For example, if there are 4 classes, sv_coef and SV are like:
+-+-+-+--------------------+
|1|1|1| |
|v|v|v| SVs from class 1 |
|2|3|4| |
+-+-+-+--------------------+
|1|2|2| |
|v|v|v| SVs from class 2 |
|2|3|4| |
+-+-+-+--------------------+
|1|2|3| |
|v|v|v| SVs from class 3 |
|3|3|4| |
+-+-+-+--------------------+
|1|2|3| |
|v|v|v| SVs from class 4 |
|4|4|4| |
+-+-+-+--------------------+
See svm_train() for an example of assigning values to sv_coef.
rho is the bias term (-b). probA and probB are parameters used in
probability outputs. If there are k classes, there are k*(k-1)/2
binary problems as well as rho, probA, and probB values. They are
aligned in the order of binary problems:
1 vs 2, 1 vs 3, ..., 1 vs k, 2 vs 3, ..., 2 vs k, ..., k-1 vs k.
label contains labels in the training data.
nSV is the number of support vectors in each class.
free_sv is a flag used to determine whether the space of SV should
be released in svm_free_model_content(struct svm_model*) and
svm_free_and_destroy_model(struct svm_model**). If the model is
generated by svm_train(), then SV points to data in svm_problem
and should not be removed. For example, free_sv is 0 if svm_model
is created by svm_train, but is 1 if created by svm_load_model.
- Function: double svm_predict(const struct svm_model *model,
const struct svm_node *x);
This function does classification or regression on a test vector x
given a model.
For a classification model, the predicted class for x is returned.
For a regression model, the function value of x calculated using
the model is returned. For a one-class model, +1 or -1 is
returned.
- Function: void svm_cross_validation(const struct svm_problem *prob,
const struct svm_parameter *param, int nr_fold, double *target);
This function conducts cross validation. Data are separated into
nr_fold folds. Under the given parameters, each fold is sequentially
validated using the model trained on the remaining folds. Predicted
labels (of all prob's instances) in the validation process are
stored in the array called target.
The format of prob is the same as that for svm_train().
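A minimal sketch of typical usage for classification (cv_accuracy is an
illustrative name, not a library function):

	#include <stdlib.h>
	#include "svm.h"

	double cv_accuracy(const struct svm_problem *prob,
	                   const struct svm_parameter *param)
	{
		int i, correct = 0;
		double *target = (double *)malloc(prob->l * sizeof(double));

		svm_cross_validation(prob, param, 5, target); /* 5-fold CV */
		for (i = 0; i < prob->l; i++)
			if (target[i] == prob->y[i])
				++correct;
		free(target);
		return 100.0 * correct / prob->l;
	}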
- Function: int svm_get_svm_type(const struct svm_model *model);
This function gives svm_type of the model. Possible values of
svm_type are defined in svm.h.
- Function: int svm_get_nr_class(const svm_model *model);
For a classification model, this function gives the number of
classes. For a regression or a one-class model, 2 is returned.
- Function: void svm_get_labels(const svm_model *model, int* label)
For a classification model, this function outputs the name of
labels into an array called label. For regression and one-class
models, label is unchanged.
- Function: double svm_get_svr_probability(const struct svm_model *model);
For a regression model with probability information, this function
outputs a value sigma > 0. For test data, we consider the
probability model: target value = predicted value + z, where z
follows the Laplace distribution e^(-|z|/sigma)/(2*sigma).
If the model is not for SVR or does not contain the required
information, 0 is returned.
- Function: double svm_predict_values(const svm_model *model,
const svm_node *x, double* dec_values)
This function gives decision values on a test vector x given a
model, and return the predicted label (classification) or
the function value (regression).
For a classification model with nr_class classes, this function
gives nr_class*(nr_class-1)/2 decision values in the array
dec_values, where nr_class can be obtained from the function
svm_get_nr_class. The order is label[0] vs. label[1], ...,
label[0] vs. label[nr_class-1], label[1] vs. label[2], ...,
label[nr_class-2] vs. label[nr_class-1], where label can be
obtained from the function svm_get_labels. The returned value is
the predicted class for x.
For a regression model, dec_values[0] and the returned value are
both the function value of x calculated using the model. For a
one-class model, dec_values[0] is the decision value of x, while
the returned value is +1/-1.
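For instance, a minimal sketch (predict_with_values is an illustrative
name) that sizes and fills the dec_values array for a classification
model:

	#include <stdlib.h>
	#include "svm.h"

	double predict_with_values(const struct svm_model *model,
	                           const struct svm_node *x)
	{
		int k = svm_get_nr_class(model);
		double *dec = (double *)malloc(k*(k-1)/2 * sizeof(double));
		double pred = svm_predict_values(model, x, dec);
		/* dec[0] is label[0] vs. label[1], etc., in the order given above */
		free(dec);
		return pred;
	}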
- Function: double svm_predict_probability(const struct svm_model *model,
const struct svm_node *x, double* prob_estimates);
This function does classification or regression on a test vector x
given a model with probability information.
For a classification model with probability information, this
function gives nr_class probability estimates in the array
prob_estimates. nr_class can be obtained from the function
svm_get_nr_class. The class with the highest probability is
returned. For regression/one-class SVM, the array prob_estimates
is unchanged and the returned value is the same as that of
svm_predict.
- Function: const char *svm_check_parameter(const struct svm_problem *prob,
const struct svm_parameter *param);
This function checks whether the parameters are within the feasible
range of the problem. This function should be called before calling
svm_train() and svm_cross_validation(). It returns NULL if the
parameters are feasible, otherwise an error message is returned.
- Function: int svm_check_probability_model(const struct svm_model *model);
This function checks whether the model contains required
information to do probability estimates. If so, it returns
+1. Otherwise, 0 is returned. This function should be called
before calling svm_get_svr_probability and
svm_predict_probability.
- Function: int svm_save_model(const char *model_file_name,
const struct svm_model *model);
This function saves a model to a file; returns 0 on success, or -1
if an error occurs.
- Function: struct svm_model *svm_load_model(const char *model_file_name);
This function returns a pointer to the model read from the file,
or a null pointer if the model could not be loaded.
- Function: void svm_free_model_content(struct svm_model *model_ptr);
This function frees the memory used by the entries in a model structure.
- Function: void svm_free_and_destroy_model(struct svm_model **model_ptr_ptr);
This function frees the memory used by a model and destroys the model
structure. It is equivalent to svm_destroy_model, which
is deprecated after version 3.0.
- Function: void svm_destroy_param(struct svm_parameter *param);
This function frees the memory used by a parameter set.
- Function: void svm_set_print_string_function(void (*print_func)(const char *));
Users can specify their own output format through this function. Use
svm_set_print_string_function(NULL);
for default printing to stdout.
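For example, a minimal sketch that silences all library output (quiet
and disable_output are illustrative names):

	#include "svm.h"

	static void quiet(const char *s) { (void)s; /* discard all messages */ }

	void disable_output(void)
	{
		svm_set_print_string_function(&quiet);
	}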
Java Version
============
The pre-compiled java class archive `libsvm.jar' and its source files are
in the java directory. To run the programs, use
java -classpath libsvm.jar svm_train
java -classpath libsvm.jar svm_predict
java -classpath libsvm.jar svm_toy
java -classpath libsvm.jar svm_scale
Note that you need Java 1.5 (5.0) or above to run it.
You may need to add Java runtime library (like classes.zip) to the classpath.
You may need to increase maximum Java heap size.
Library usage is similar to the C version. These functions are available:
public class svm {
public static final int LIBSVM_VERSION=300;
public static svm_model svm_train(svm_problem prob, svm_parameter param);
public static void svm_cross_validation(svm_problem prob, svm_parameter param, int nr_fold, double[] target);
public static int svm_get_svm_type(svm_model model);
public static int svm_get_nr_class(svm_model model);
public static void svm_get_labels(svm_model model, int[] label);
public static double svm_get_svr_probability(svm_model model);
public static double svm_predict_values(svm_model model, svm_node[] x, double[] dec_values);
public static double svm_predict(svm_model model, svm_node[] x);
public static double svm_predict_probability(svm_model model, svm_node[] x, double[] prob_estimates);
public static void svm_save_model(String model_file_name, svm_model model) throws IOException
public static svm_model svm_load_model(String model_file_name) throws IOException
public static String svm_check_parameter(svm_problem prob, svm_parameter param);
public static int svm_check_probability_model(svm_model model);
public static void svm_set_print_string_function(svm_print_interface print_func);
}
The library is in the "libsvm" package.
Note that in the Java version, svm_node[] is not ended with a node whose index = -1.
Users can specify their output format by
your_print_func = new svm_print_interface()
{
public void print(String s)
{
// your own format
}
};
svm.svm_set_print_string_function(your_print_func);
Building Windows Binaries
=========================
Windows binaries are in the directory `windows'. To build them via
Visual C++, use the following steps:
1. Open a DOS command box (or Visual Studio Command Prompt) and change
to the libsvm directory. If the environment variables of VC++ have not
been set, type
"C:\Program Files\Microsoft Visual Studio 10.0\VC\bin\vcvars32.bat"
You may have to modify the above command according to the version of
VC++ and where it is installed.
2. Type
nmake -f Makefile.win clean all
3. (optional) To build shared library libsvm.dll, type
nmake -f Makefile.win lib
Another way is to build them from the Visual C++ environment. See
details in the libsvm FAQ.
- Additional Tools: Sub-sampling, Parameter Selection, Format checking, etc.
============================================================================
See the README file in the tools directory.
Python Interface
================
See the README file in python directory.
Additional Information
======================
If you find LIBSVM helpful, please cite it as
Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for
support vector machines, 2001.
Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
LIBSVM implementation document is available at
http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf
For any questions and comments, please email cjlin@csie.ntu.edu.tw
Acknowledgments:
This work was supported in part by the National Science
Council of Taiwan via the grant NSC 89-2213-E-002-013.
The authors thank their group members and users
for many helpful discussions and comments. They are listed in
http://www.csie.ntu.edu.tw/~cjlin/libsvm/acknowledgements
================================================
FILE: src/linux/README-GPU
================================================
GPU-Accelerated LIBSVM exploits the GPU, via the CUDA interface, to
speed up the training process. This package contains a new executable for
training classifiers, "svm-train-gpu.exe", together with the original one.
The new executable is used exactly like the original one.
FEATURES
Mode Supported
* c-svc classification with RBF kernel
Functionality / User interface
* Same as LIBSVM
PREREQUISITES
* NVIDIA Graphics card with CUDA support
* Latest NVIDIA drivers for GPU
* CUDA toolkit & GPU Computing SDK 5.5
Download all in one package from:
https://developer.nvidia.com/cuda-downloads
INSTRUCTIONS
1. Install the NVIDIA drivers, CUDA toolkit and GPU Computing SDK code samples. You can find them all in one package here:
https://developer.nvidia.com/cuda-downloads (Version 5.5)
You may need to install some additional packages to complete the installation above.
A very helpful and descriptive guide is on the CUDA webpage:
http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html
Make sure you have followed every step that is relevant to your system, such as setting $PATH and $LD_LIBRARY_PATH in your bash configuration file.
2. Copy this folder anywhere you like.
3. Use the Makefile found inside this folder.
4. Find the "svm-train-gpu" executable inside this folder.
Troubleshooting
1. Nearly all problems are resolved by reading carefully through the NVIDIA guidelines.
2. When building, a "cannot find cublas_v2.h" or "cuda_runtime.h" error may arise. Find where these files are located (the default path is "/usr/local/cuda-5.5/include") and replace the header paths in the "kernel_matrix_calculation.c" file with your system paths. Alternatively, change the default CUDA toolkit path in the Makefile (CUDA_PATH ?= "/usr/local/cuda-5.5") to your CUDA toolkit path.
Additional Information
======================
If you find GPU-Accelerated LIBSVM helpful, please cite it as
A. Athanasopoulos, A. Dimou, V. Mezaris, I. Kompatsiaris, "GPU Acceleration for Support Vector Machines",
Proc. 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2011), Delft, The Netherlands, April 2011.
Software available at http://mklab.iti.gr/project/GPU-LIBSVM
================================================
FILE: src/linux/cross_validation_with_matrix_precomputation.c
================================================
void setup_pkm(struct svm_problem *p_km)
{
int i;
p_km->l = prob.l;
p_km->x = Malloc(struct svm_node,p_km->l);
p_km->y = Malloc(double,p_km->l);
	for(i=0;i<p_km->l;i++)
	{
		(p_km->x+i)->values = Malloc(double,prob.l+1);
		(p_km->x+i)->dim = prob.l+1;
	}
	for( i=0; i<prob.l; i++ )
		p_km->y[i] = prob.y[i];
}
void free_pkm(struct svm_problem *p_km)
{
int i;
	for(i=0;i<p_km->l;i++)
		free((p_km->x+i)->values);
free( p_km->x );
free( p_km->y );
}
double do_crossvalidation(struct svm_problem * p_km)
{
double rate;
int i;
int total_correct = 0;
double total_error = 0;
double sumv = 0, sumy = 0, sumvv = 0, sumyy = 0, sumvy = 0;
double *target = Malloc(double,prob.l);
	svm_cross_validation(p_km,&param,nr_fold,target);
if(param.svm_type == EPSILON_SVR ||
param.svm_type == NU_SVR)
{
		for(i=0;i<prob.l;i++)
		{
			double y = prob.y[i];
			double v = target[i];
			total_error += (v-y)*(v-y);
			sumv += v;
			sumy += y;
			sumvv += v*v;
			sumyy += y*y;
			sumvy += v*y;
		}
		printf("Cross Validation Mean squared error = %g\n",total_error/prob.l);
		printf("Cross Validation Squared correlation coefficient = %g\n",
			((prob.l*sumvy-sumv*sumy)*(prob.l*sumvy-sumv*sumy))/
			((prob.l*sumvv-sumv*sumv)*(prob.l*sumyy-sumy*sumy)));
		rate = total_error/prob.l; // report MSE for regression
	}
	else
	{
		for(i=0;i<prob.l;i++)
			if(target[i] == prob.y[i])
				++total_correct;
		rate = 100.0*total_correct/prob.l;
		printf("Cross Validation Accuracy = %g%%\n",rate);
	}
	free(target);
	return rate;
}

================================================
FILE: src/linux/Makefile
================================================
# Location of the CUDA Toolkit
CUDA_PATH ?= "/usr/local/cuda-5.5"
# OS detection
OSUPPER = $(shell uname -s 2>/dev/null | tr "[:lower:]" "[:upper:]")
OSLOWER = $(shell uname -s 2>/dev/null | tr "[:upper:]" "[:lower:]")
# Flags to detect 32-bit or 64-bit OS platform
OS_SIZE = $(shell uname -m | sed -e "s/i.86/32/" -e "s/x86_64/64/" -e "s/armv7l/32/")
OS_ARCH = $(shell uname -m | sed -e "s/i386/i686/")
# Determine OS platform and unix distribution
ifeq ("$(OSLOWER)","linux")
# first search lsb_release
DISTRO = $(shell lsb_release -i -s 2>/dev/null | tr "[:upper:]" "[:lower:]")
DISTVER = $(shell lsb_release -r -s 2>/dev/null)
ifeq ("$(DISTRO)",'')
# second search and parse /etc/issue
DISTRO = $(shell more /etc/issue | awk '{print $$1}' | sed '1!d' | sed -e "/^$$/d" 2>/dev/null | tr "[:upper:]" "[:lower:]")
DISTVER = $(shell more /etc/issue | awk '{print $$2}' | sed '1!d' 2>/dev/null)
endif
ifeq ("$(DISTRO)",'')
# third, we can search in /etc/os-release or /etc/{distro}-release
DISTRO = $(shell awk '/ID/' /etc/*-release | sed 's/ID=//' | grep -v "VERSION" | grep -v "ID" | grep -v "DISTRIB")
DISTVER= $(shell awk '/DISTRIB_RELEASE/' /etc/*-release | sed 's/DISTRIB_RELEASE=//' | grep -v "DISTRIB_RELEASE")
endif
endif
# search at Darwin (unix based info)
DARWIN = $(strip $(findstring DARWIN, $(OSUPPER)))
ifneq ($(DARWIN),)
SNOWLEOPARD = $(strip $(findstring 10.6, $(shell egrep "10\.6" /System/Library/CoreServices/SystemVersion.plist)))
LION = $(strip $(findstring 10.7, $(shell egrep "10\.7" /System/Library/CoreServices/SystemVersion.plist)))
MOUNTAIN = $(strip $(findstring 10.8, $(shell egrep "10\.8" /System/Library/CoreServices/SystemVersion.plist)))
MAVERICKS = $(strip $(findstring 10.9, $(shell egrep "10\.9" /System/Library/CoreServices/SystemVersion.plist)))
endif
# Common binaries
GCC ?= g++
CLANG ?= /usr/bin/clang
ifeq ("$(OSUPPER)","LINUX")
NVCC ?= $(CUDA_PATH)/bin/nvcc -ccbin $(GCC)
else
# for some newer versions of XCode, CLANG is the default compiler, so we need to include this
ifneq ($(MAVERICKS),)
NVCC ?= $(CUDA_PATH)/bin/nvcc -ccbin $(CLANG)
STDLIB ?= -stdlib=libstdc++
else
NVCC ?= $(CUDA_PATH)/bin/nvcc -ccbin $(GCC)
endif
endif
# Take command line flags that override any of these settings
ifeq ($(i386),1)
OS_SIZE = 32
OS_ARCH = i686
endif
ifeq ($(x86_64),1)
OS_SIZE = 64
OS_ARCH = x86_64
endif
ifeq ($(ARMv7),1)
OS_SIZE = 32
OS_ARCH = armv7l
endif
ifeq ("$(OSUPPER)","LINUX")
# Each Linux Distribuion has a set of different paths. This applies especially when using the Linux RPM/debian packages
ifeq ("$(DISTRO)","ubuntu")
CUDAPATH ?= /usr/lib/nvidia-current
CUDALINK ?= -L/usr/lib/nvidia-current
DFLT_PATH = /usr/lib
endif
ifeq ("$(DISTRO)","kubuntu")
CUDAPATH ?= /usr/lib/nvidia-current
CUDALINK ?= -L/usr/lib/nvidia-current
DFLT_PATH = /usr/lib
endif
ifeq ("$(DISTRO)","debian")
CUDAPATH ?= /usr/lib/nvidia-current
CUDALINK ?= -L/usr/lib/nvidia-current
DFLT_PATH = /usr/lib
endif
ifeq ("$(DISTRO)","suse")
ifeq ($(OS_SIZE),64)
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH = /usr/lib64
else
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH = /usr/lib
endif
endif
ifeq ("$(DISTRO)","suse linux")
ifeq ($(OS_SIZE),64)
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH = /usr/lib64
else
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH = /usr/lib
endif
endif
ifeq ("$(DISTRO)","opensuse")
ifeq ($(OS_SIZE),64)
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH = /usr/lib64
else
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH = /usr/lib
endif
endif
ifeq ("$(DISTRO)","fedora")
ifeq ($(OS_SIZE),64)
CUDAPATH ?= /usr/lib64/nvidia
CUDALINK ?= -L/usr/lib64/nvidia
DFLT_PATH = /usr/lib64
else
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH = /usr/lib
endif
endif
ifeq ("$(DISTRO)","redhat")
ifeq ($(OS_SIZE),64)
CUDAPATH ?= /usr/lib64/nvidia
CUDALINK ?= -L/usr/lib64/nvidia
DFLT_PATH = /usr/lib64
else
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH = /usr/lib
endif
endif
ifeq ("$(DISTRO)","red")
ifeq ($(OS_SIZE),64)
CUDAPATH ?= /usr/lib64/nvidia
CUDALINK ?= -L/usr/lib64/nvidia
DFLT_PATH = /usr/lib64
else
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH = /usr/lib
endif
endif
ifeq ("$(DISTRO)","redhatenterpriseworkstation")
ifeq ($(OS_SIZE),64)
CUDAPATH ?= /usr/lib64/nvidia
CUDALINK ?= -L/usr/lib64/nvidia
DFLT_PATH ?= /usr/lib64
else
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH ?= /usr/lib
endif
endif
ifeq ("$(DISTRO)","centos")
ifeq ($(OS_SIZE),64)
CUDAPATH ?= /usr/lib64/nvidia
CUDALINK ?= -L/usr/lib64/nvidia
DFLT_PATH = /usr/lib64
else
CUDAPATH ?=
CUDALINK ?=
DFLT_PATH = /usr/lib
endif
endif
ifeq ($(ARMv7),1)
CUDAPATH := /usr/arm-linux-gnueabihf/lib
CUDALINK := -L/usr/arm-linux-gnueabihf/lib
ifneq ($(TARGET_FS),)
CUDAPATH += $(TARGET_FS)/usr/lib/nvidia-current
CUDALINK += -L$(TARGET_FS)/usr/lib/nvidia-current
endif
endif
# Search for Linux distribution path for libcuda.so
CUDALIB ?= $(shell find $(CUDAPATH) $(DFLT_PATH) -name libcuda.so -print 2>/dev/null)
ifeq ("$(CUDALIB)",'')
$(info >>> WARNING - CUDA Driver libcuda.so is not found. Please check and re-install the NVIDIA driver. <<<)
EXEC=@echo "[@]"
endif
else
# This would be the Mac OS X path if we had to do anything special
endif
================================================
FILE: src/linux/kernel_matrix_calculation.c
================================================
#include "/usr/local/cuda-5.5/include/cuda_runtime.h"
#include "/usr/local/cuda-5.5/include/cublas_v2.h"
// Scalars
const float alpha = 1;
const float beta = 0;
void ckm( struct svm_problem *prob, struct svm_problem *pecm, float *gamma )
{
cublasStatus_t status;
double g_val = *gamma;
long int nfa;
int len_tv;
int ntv;
int i_v;
int i_el;
int i_r, i_c;
int trvei;
double *tv_sq;
double *v_f_g;
float *tr_ar;
float *tva, *vtm, *DP;
float *g_tva = 0, *g_vtm = 0, *g_DotProd = 0;
cudaError_t cudaStat;
cublasHandle_t handle;
status = cublasCreate(&handle);
len_tv = prob-> x[0].dim;
ntv = prob-> l;
nfa = len_tv * ntv;
tva = (float*) malloc ( len_tv * ntv* sizeof(float) );
vtm = (float*) malloc ( len_tv * sizeof(float) );
DP = (float*) malloc ( ntv * sizeof(float) );
tr_ar = (float*) malloc ( len_tv * ntv* sizeof(float) );
tv_sq = (double*) malloc ( ntv * sizeof(double) );
v_f_g = (double*) malloc ( ntv * sizeof(double) );
for ( i_r = 0; i_r < ntv ; i_r++ )
{
for ( i_c = 0; i_c < len_tv; i_c++ )
tva[i_r * len_tv + i_c] = (float)prob-> x[i_r].values[i_c];
}
cudaStat = cudaMalloc((void**)&g_tva, len_tv * ntv * sizeof(float));
if (cudaStat != cudaSuccess) {
free( tva );
free( vtm );
free( DP );
free( v_f_g );
free( tv_sq );
cudaFree( g_tva );
cublasDestroy( handle );
fprintf (stderr, "!!!! Device memory allocation error (A)\n");
getchar();
return;
}
cudaStat = cudaMalloc((void**)&g_vtm, len_tv * sizeof(float));
cudaStat = cudaMalloc((void**)&g_DotProd, ntv * sizeof(float));
for( i_r = 0; i_r < ntv; i_r++ )
for( i_c = 0; i_c < len_tv; i_c++ )
tr_ar[i_c * ntv + i_r] = tva[i_r * len_tv + i_c];
// Copy cpu vector to gpu vector
status = cublasSetVector( len_tv * ntv, sizeof(float), tr_ar, 1, g_tva, 1 );
free( tr_ar );
for( i_v = 0; i_v < ntv; i_v++ )
{
tv_sq[ i_v ] = 0;
for( i_el = 0; i_el < len_tv; i_el++ )
tv_sq[i_v] += pow( tva[i_v*len_tv + i_el], (float)2.0 );
}
for ( trvei = 0; trvei < ntv; trvei++ )
{
status = cublasSetVector( len_tv, sizeof(float), &tva[trvei * len_tv], 1, g_vtm, 1 );
status = cublasSgemv( handle, CUBLAS_OP_N, ntv, len_tv, &alpha, g_tva, ntv , g_vtm, 1, &beta, g_DotProd, 1 );
status = cublasGetVector( ntv, sizeof(float), g_DotProd, 1, DP, 1 );
for ( i_c = 0; i_c < ntv; i_c++ )
v_f_g[i_c] = exp( -g_val * (tv_sq[trvei] + tv_sq[i_c]-((double)2.0)* (double)DP[i_c] ));
pecm-> x[trvei].values[0] = trvei + 1;
for ( i_c = 0; i_c < ntv; i_c++ )
pecm-> x[trvei].values[i_c + 1] = v_f_g[i_c];
}
free( tva );
free( vtm );
free( DP );
free( v_f_g );
free( tv_sq );
cudaFree( g_tva );
cudaFree( g_vtm );
cudaFree( g_DotProd );
cublasDestroy( handle );
}
void cal_km( struct svm_problem * p_km)
{
float gamma = param.gamma;
ckm(&prob, p_km, &gamma);
}
================================================
FILE: src/linux/readme.txt
================================================
Instructions to compile Linux GPU-Accelerated LIBSVM
1. Install the NVIDIA drivers, CUDA toolkit and GPU Computing SDK code samples. You can find them all in one package here:
https://developer.nvidia.com/cuda-downloads (Version 5.5)
You may need to install some additional packages to complete the installation above.
A very helpful and descriptive guide is on the CUDA webpage:
http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html
Make sure you have followed every step that is relevant to your system, such as setting $PATH and $LD_LIBRARY_PATH in your bash configuration file.
2. Copy this folder anywhere you like.
3. Use the Makefile found inside this folder.
4. Find the "svm-train-gpu" executable inside this folder.
Troubleshooting
1. Nearly all problems are resolved by reading carefully through the NVIDIA guidelines.
2. When building, a "cannot find cublas_v2.h" or "cuda_runtime.h" error may arise. Find where these files are located (the default path is "/usr/local/cuda-5.5/include") and replace the header paths in the "kernel_matrix_calculation.c" file with your system paths.
Alternatively, you can change the default CUDA toolkit path in the Makefile (CUDA_PATH ?= "/usr/local/cuda-5.5") to your CUDA toolkit path.
================================================
FILE: src/linux/svm-train.c
================================================
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <errno.h>
#include "svm.h"
#define Malloc(type,n) (type *)malloc((n)*sizeof(type))
void print_null(const char *s) {}
void exit_with_help()
{
printf(
"Usage: svm-train [options] training_set_file [model_file]\n"
"options:\n"
"-s svm_type : set type of SVM (default 0)\n"
" 0 -- C-SVC (multi-class classification)\n"
" 1 -- nu-SVC (multi-class classification)\n"
" 2 -- one-class SVM\n"
" 3 -- epsilon-SVR (regression)\n"
" 4 -- nu-SVR (regression)\n"
"-t kernel_type : set type of kernel function (default 2)\n"
" 0 -- linear: u'*v\n"
" 1 -- polynomial: (gamma*u'*v + coef0)^degree\n"
" 2 -- radial basis function: exp(-gamma*|u-v|^2)\n"
" 3 -- sigmoid: tanh(gamma*u'*v + coef0)\n"
" 4 -- precomputed kernel (kernel values in training_set_file)\n"
"-d degree : set degree in kernel function (default 3)\n"
"-g gamma : set gamma in kernel function (default 1/num_features)\n"
"-r coef0 : set coef0 in kernel function (default 0)\n"
"-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)\n"
"-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)\n"
"-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)\n"
"-m cachesize : set cache memory size in MB (default 100)\n"
"-e epsilon : set tolerance of termination criterion (default 0.001)\n"
"-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)\n"
"-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)\n"
"-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)\n"
"-v n: n-fold cross validation mode\n"
"-q : quiet mode (no outputs)\n"
);
exit(1);
}
void exit_input_error(int line_num)
{
fprintf(stderr,"Wrong input format at line %d\n", line_num);
exit(1);
}
void parse_command_line(int argc, char **argv, char *input_file_name, char *model_file_name);
void read_problem(const char *filename);
void do_cross_validation();
struct svm_parameter param; // set by parse_command_line
struct svm_problem prob; // set by read_problem
struct svm_model *model;
struct svm_node *x_space;
int cross_validation;
int nr_fold;
static char *line = NULL;
static int max_line_len;
#include "kernel_matrix_calculation.c"
#include "cross_validation_with_matrix_precomputation.c"
static char* readline(FILE *input)
{
int len;
if(fgets(line,max_line_len,input) == NULL)
return NULL;
while(strrchr(line,'\n') == NULL)
{
max_line_len *= 2;
line = (char *) realloc(line,max_line_len);
len = (int) strlen(line);
if(fgets(line+len,max_line_len-len,input) == NULL)
break;
}
return line;
}
int main(int argc, char **argv)
{
int i;
char input_file_name[1024];
char model_file_name[1024];
const char *error_msg;
parse_command_line(argc, argv, input_file_name, model_file_name);
read_problem(input_file_name);
	error_msg = svm_check_parameter(&prob,&param);
if(error_msg)
{
fprintf(stderr,"ERROR: %s\n",error_msg);
exit(1);
}
if(cross_validation)
{
do_cross_validation_with_KM_precalculated( );
// do_cross_validation();
}
else
{
		model = svm_train(&prob,&param);
if(svm_save_model(model_file_name,model))
{
fprintf(stderr, "can't save model to file %s\n", model_file_name);
exit(1);
}
svm_free_and_destroy_model(&model);
}
	svm_destroy_param(&param);
free(prob.y);
#ifdef _DENSE_REP
for (i = 0; i < prob.l; ++i)
free((prob.x+i)->values);
#else
free(x_space);
#endif
free(prob.x);
free(line);
return 0;
}
void do_cross_validation()
{
int i;
int total_correct = 0;
double total_error = 0;
double sumv = 0, sumy = 0, sumvv = 0, sumyy = 0, sumvy = 0;
double *target = Malloc(double,prob.l);
	svm_cross_validation(&prob,&param,nr_fold,target);
if(param.svm_type == EPSILON_SVR ||
param.svm_type == NU_SVR)
{
		for(i=0;i<prob.l;i++)
		{
			double y = prob.y[i];
			double v = target[i];
			total_error += (v-y)*(v-y);
			sumv += v;
			sumy += y;
			sumvv += v*v;
			sumyy += y*y;
			sumvy += v*y;
		}
		printf("Cross Validation Mean squared error = %g\n",total_error/prob.l);
		printf("Cross Validation Squared correlation coefficient = %g\n",
			((prob.l*sumvy-sumv*sumy)*(prob.l*sumvy-sumv*sumy))/
			((prob.l*sumvv-sumv*sumv)*(prob.l*sumyy-sumy*sumy)));
	}
	else
	{
		for(i=0;i<prob.l;i++)
			if(target[i] == prob.y[i])
				++total_correct;
		printf("Cross Validation Accuracy = %g%%\n",100.0*total_correct/prob.l);
	}
	free(target);
}

void parse_command_line(int argc, char **argv, char *input_file_name, char *model_file_name)
{
	int i;
	void (*print_func)(const char*) = NULL;	// default printing to stdout

	// default values
	param.svm_type = C_SVC;
	param.kernel_type = RBF;
	param.degree = 3;
	param.gamma = 0;	// 1/num_features
	param.coef0 = 0;
	param.nu = 0.5;
	param.cache_size = 100;
	param.C = 1;
	param.eps = 1e-3;
	param.p = 0.1;
	param.shrinking = 1;
	param.probability = 0;
	param.nr_weight = 0;
	param.weight_label = NULL;
	param.weight = NULL;
	cross_validation = 0;

	// parse options
	for(i=1;i<argc;i++)
	{
		if(argv[i][0] != '-') break;
		if(++i>=argc)
exit_with_help();
switch(argv[i-1][1])
{
case 's':
param.svm_type = atoi(argv[i]);
break;
case 't':
param.kernel_type = atoi(argv[i]);
break;
case 'd':
param.degree = atoi(argv[i]);
break;
case 'g':
param.gamma = atof(argv[i]);
break;
case 'r':
param.coef0 = atof(argv[i]);
break;
case 'n':
param.nu = atof(argv[i]);
break;
case 'm':
param.cache_size = atof(argv[i]);
break;
case 'c':
param.C = atof(argv[i]);
break;
case 'e':
param.eps = atof(argv[i]);
break;
case 'p':
param.p = atof(argv[i]);
break;
case 'h':
param.shrinking = atoi(argv[i]);
break;
case 'b':
param.probability = atoi(argv[i]);
break;
case 'q':
print_func = &print_null;
i--;
break;
case 'v':
cross_validation = 1;
nr_fold = atoi(argv[i]);
if(nr_fold < 2)
{
fprintf(stderr,"n-fold cross validation: n must >= 2\n");
exit_with_help();
}
break;
case 'w':
++param.nr_weight;
param.weight_label = (int *)realloc(param.weight_label,sizeof(int)*param.nr_weight);
param.weight = (double *)realloc(param.weight,sizeof(double)*param.nr_weight);
param.weight_label[param.nr_weight-1] = atoi(&argv[i-1][2]);
param.weight[param.nr_weight-1] = atof(argv[i]);
break;
default:
fprintf(stderr,"Unknown option: -%c\n", argv[i-1][1]);
exit_with_help();
}
}
svm_set_print_string_function(print_func);
// determine filenames
if(i>=argc)
exit_with_help();
strcpy(input_file_name, argv[i]);
	if(i<argc-1)
		strcpy(model_file_name,argv[i+1]);
	else
	{
		char *p = strrchr(argv[i],'/');
		if(p==NULL)
			p = argv[i];
		else
			++p;
		sprintf(model_file_name,"%s.model",p);
	}
}

// read in a problem (in svmlight format)
void read_problem(const char *filename)
{
	int max_index, inst_max_index, i;
	size_t elements, j;
	FILE *fp = fopen(filename,"r");
	char *endptr;
	char *idx, *val, *label;
#ifdef _DENSE_REP
	int *d;
#endif

	if(fp == NULL)
	{
		fprintf(stderr,"can't open input file %s\n",filename);
		exit(1);
	}

	prob.l = 0;
	elements = 0;

	max_line_len = 1024;
	line = Malloc(char,max_line_len);

#ifdef _DENSE_REP
	max_index = 0;

	// read the number of instances and the max index
	while(readline(fp)!=NULL)
	{
		char *p = strrchr(line,':');
		if(p != NULL)
		{
			while(*p != ' ' && *p != '\t' && p > line)
p--;
if(p > line)
max_index = (int) strtol(p,&endptr,10) + 1;
}
if(max_index > elements)
elements = max_index;
++prob.l;
}
rewind(fp);
prob.y = Malloc(double,prob.l);
prob.x = Malloc(struct svm_node,prob.l);
	for(i=0;i<prob.l;i++)
	{
		(prob.x+i)->values = Malloc(double,elements);
(prob.x+i)->dim = 0;
		inst_max_index = -1; // strtol gives 0 if wrong format, and precomputed kernel has <index> start from 0
readline(fp);
label = strtok(line," \t");
prob.y[i] = strtod(label,&endptr);
if(endptr == label)
exit_input_error(i+1);
while(1)
{
idx = strtok(NULL,":");
val = strtok(NULL," \t");
if(val == NULL)
break;
errno = 0;
j = (int) strtol(idx,&endptr,10);
if(endptr == idx || errno != 0 || *endptr != '\0' || j <= inst_max_index)
exit_input_error(i+1);
else
inst_max_index = j;
errno = 0;
value = strtod(val,&endptr);
if(endptr == val || errno != 0 || (*endptr != '\0' && !isspace(*endptr)))
exit_input_error(i+1);
d = &((prob.x+i)->dim);
while (*d < j)
(prob.x+i)->values[(*d)++] = 0.0;
(prob.x+i)->values[(*d)++] = value;
}
}
max_index = elements-1;
#else
while(readline(fp)!=NULL)
{
char *p = strtok(line," \t"); // label
// features
while(1)
{
p = strtok(NULL," \t");
if(p == NULL || *p == '\n') // check '\n' as ' ' may be after the last feature
break;
++elements;
}
++elements;
++prob.l;
}
rewind(fp);
prob.y = Malloc(double,prob.l);
prob.x = Malloc(struct svm_node *,prob.l);
x_space = Malloc(struct svm_node,elements);
max_index = 0;
j=0;
	for(i=0;i<prob.l;i++)
	{
		inst_max_index = -1; // strtol gives 0 if wrong format, and precomputed kernel has <index> start from 0
readline(fp);
prob.x[i] = &x_space[j];
label = strtok(line," \t\n");
if(label == NULL) // empty line
exit_input_error(i+1);
prob.y[i] = strtod(label,&endptr);
if(endptr == label || *endptr != '\0')
exit_input_error(i+1);
while(1)
{
idx = strtok(NULL,":");
val = strtok(NULL," \t");
if(val == NULL)
break;
errno = 0;
x_space[j].index = (int) strtol(idx,&endptr,10);
if(endptr == idx || errno != 0 || *endptr != '\0' || x_space[j].index <= inst_max_index)
exit_input_error(i+1);
else
inst_max_index = x_space[j].index;
errno = 0;
x_space[j].value = strtod(val,&endptr);
if(endptr == val || errno != 0 || (*endptr != '\0' && !isspace(*endptr)))
exit_input_error(i+1);
++j;
}
if(inst_max_index > max_index)
max_index = inst_max_index;
x_space[j++].index = -1;
}
#endif
if(param.gamma == 0 && max_index > 0)
param.gamma = 1.0/max_index;
if(param.kernel_type == PRECOMPUTED)
		for(i=0;i<prob.l;i++)
		{
#ifdef _DENSE_REP
			if ((prob.x+i)->dim == 0 || (prob.x+i)->values[0] == 0.0)
{
fprintf(stderr,"Wrong input format: first column must be 0:sample_serial_number\n");
exit(1);
}
if ((int)(prob.x+i)->values[0] < 0 || (int)(prob.x+i)->values[0] > max_index)
{
fprintf(stderr,"Wrong input format: sample_serial_number out of range\n");
exit(1);
}
#else
if (prob.x[i][0].index != 0)
{
fprintf(stderr,"Wrong input format: first column must be 0:sample_serial_number\n");
exit(1);
}
if ((int)prob.x[i][0].value <= 0 || (int)prob.x[i][0].value > max_index)
{
fprintf(stderr,"Wrong input format: sample_serial_number out of range\n");
exit(1);
}
#endif
}
fclose(fp);
}
================================================
FILE: src/linux/svm.cpp
================================================
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <float.h>
#include <string.h>
#include <stdarg.h>
#include <limits.h>
#include <locale.h>
#include "svm.h"
int libsvm_version = LIBSVM_VERSION;
typedef float Qfloat;
typedef signed char schar;
#ifndef min
template <class T> static inline T min(T x,T y) { return (x<y)?x:y; }
#endif
#ifndef max
template <class T> static inline T max(T x,T y) { return (x>y)?x:y; }
#endif
template <class T> static inline void swap(T& x, T& y) { T t=x; x=y; y=t; }
template <class S, class T> static inline void clone(T*& dst, S* src, int n)
{
dst = new T[n];
memcpy((void *)dst,(void *)src,sizeof(T)*n);
}
static inline double powi(double base, int times)
{
double tmp = base, ret = 1.0;
for(int t=times; t>0; t/=2)
{
if(t%2==1) ret*=tmp;
tmp = tmp * tmp;
}
return ret;
}
#define INF HUGE_VAL
#define TAU 1e-12
#define Malloc(type,n) (type *)malloc((n)*sizeof(type))
static void print_string_stdout(const char *s)
{
fputs(s,stdout);
fflush(stdout);
}
static void (*svm_print_string) (const char *) = &print_string_stdout;
#if 1
static void info(const char *fmt,...)
{
char buf[BUFSIZ];
va_list ap;
va_start(ap,fmt);
vsprintf(buf,fmt,ap);
va_end(ap);
(*svm_print_string)(buf);
}
#else
static void info(const char *fmt,...) {}
#endif
//
// Kernel Cache
//
// l is the number of total data items
// size is the cache size limit in bytes
//
class Cache
{
public:
Cache(int l,long int size);
~Cache();
// request data [0,len)
// return some position p where [p,len) need to be filled
// (p >= len if nothing needs to be filled)
int get_data(const int index, Qfloat **data, int len);
void swap_index(int i, int j);
private:
int l;
long int size;
struct head_t
{
head_t *prev, *next; // a circular list
Qfloat *data;
int len; // data[0,len) is cached in this entry
};
head_t *head;
head_t lru_head;
void lru_delete(head_t *h);
void lru_insert(head_t *h);
};
Cache::Cache(int l_,long int size_):l(l_),size(size_)
{
head = (head_t *)calloc(l,sizeof(head_t)); // initialized to 0
size /= sizeof(Qfloat);
size -= l * sizeof(head_t) / sizeof(Qfloat);
size = max(size, 2 * (long int) l); // cache must be large enough for two columns
lru_head.next = lru_head.prev = &lru_head;
}
Cache::~Cache()
{
for(head_t *h = lru_head.next; h != &lru_head; h=h->next)
free(h->data);
free(head);
}
void Cache::lru_delete(head_t *h)
{
// delete from current location
h->prev->next = h->next;
h->next->prev = h->prev;
}
void Cache::lru_insert(head_t *h)
{
// insert to last position
h->next = &lru_head;
h->prev = lru_head.prev;
h->prev->next = h;
h->next->prev = h;
}
int Cache::get_data(const int index, Qfloat **data, int len)
{
head_t *h = &head[index];
if(h->len) lru_delete(h);
int more = len - h->len;
if(more > 0)
{
// free old space
while(size < more)
{
head_t *old = lru_head.next;
lru_delete(old);
free(old->data);
size += old->len;
old->data = 0;
old->len = 0;
}
// allocate new space
h->data = (Qfloat *)realloc(h->data,sizeof(Qfloat)*len);
size -= more;
swap(h->len,len);
}
lru_insert(h);
*data = h->data;
return len;
}
void Cache::swap_index(int i, int j)
{
if(i==j) return;
if(head[i].len) lru_delete(&head[i]);
if(head[j].len) lru_delete(&head[j]);
swap(head[i].data,head[j].data);
swap(head[i].len,head[j].len);
if(head[i].len) lru_insert(&head[i]);
if(head[j].len) lru_insert(&head[j]);
if(i>j) swap(i,j);
for(head_t *h = lru_head.next; h!=&lru_head; h=h->next)
{
if(h->len > i)
{
if(h->len > j)
swap(h->data[i],h->data[j]);
else
{
// give up
lru_delete(h);
free(h->data);
size += h->len;
h->data = 0;
h->len = 0;
}
}
}
}
//
// Kernel evaluation
//
// the static method k_function is for doing single kernel evaluation
// the constructor of Kernel prepares to calculate the l*l kernel matrix
// the member function get_Q is for getting one column from the Q Matrix
//
class QMatrix {
public:
virtual Qfloat *get_Q(int column, int len) const = 0;
virtual double *get_QD() const = 0;
virtual void swap_index(int i, int j) const = 0;
virtual ~QMatrix() {}
};
class Kernel: public QMatrix {
public:
#ifdef _DENSE_REP
Kernel(int l, svm_node * x, const svm_parameter& param);
#else
Kernel(int l, svm_node * const * x, const svm_parameter& param);
#endif
virtual ~Kernel();
static double k_function(const svm_node *x, const svm_node *y,
const svm_parameter& param);
virtual Qfloat *get_Q(int column, int len) const = 0;
virtual double *get_QD() const = 0;
virtual void swap_index(int i, int j) const // no so const...
{
swap(x[i],x[j]);
if(x_square) swap(x_square[i],x_square[j]);
}
protected:
double (Kernel::*kernel_function)(int i, int j) const;
private:
#ifdef _DENSE_REP
svm_node *x;
#else
const svm_node **x;
#endif
double *x_square;
// svm_parameter
const int kernel_type;
const int degree;
const double gamma;
const double coef0;
static double dot(const svm_node *px, const svm_node *py);
#ifdef _DENSE_REP
static double dot(const svm_node &px, const svm_node &py);
#endif
double kernel_linear(int i, int j) const
{
return dot(x[i],x[j]);
}
double kernel_poly(int i, int j) const
{
return powi(gamma*dot(x[i],x[j])+coef0,degree);
}
double kernel_rbf(int i, int j) const
{
return exp(-gamma*(x_square[i]+x_square[j]-2*dot(x[i],x[j])));
}
double kernel_sigmoid(int i, int j) const
{
return tanh(gamma*dot(x[i],x[j])+coef0);
}
double kernel_precomputed(int i, int j) const
{
#ifdef _DENSE_REP
return (x+i)->values[(int)((x+j)->values[0])];
#else
return x[i][(int)(x[j][0].value)].value;
#endif
}
};
#ifdef _DENSE_REP
Kernel::Kernel(int l, svm_node * x_, const svm_parameter& param)
#else
Kernel::Kernel(int l, svm_node * const * x_, const svm_parameter& param)
#endif
:kernel_type(param.kernel_type), degree(param.degree),
gamma(param.gamma), coef0(param.coef0)
{
switch(kernel_type)
{
case LINEAR:
kernel_function = &Kernel::kernel_linear;
break;
case POLY:
kernel_function = &Kernel::kernel_poly;
break;
case RBF:
kernel_function = &Kernel::kernel_rbf;
break;
case SIGMOID:
kernel_function = &Kernel::kernel_sigmoid;
break;
case PRECOMPUTED:
kernel_function = &Kernel::kernel_precomputed;
break;
}
clone(x,x_,l);
if(kernel_type == RBF)
{
x_square = new double[l];
		for(int i=0;i<l;i++)
			x_square[i] = dot(x[i],x[i]);
	}
	else
		x_square = 0;
}

Kernel::~Kernel()
{
	delete[] x;
	delete[] x_square;
}

#ifdef _DENSE_REP
double Kernel::dot(const svm_node *px, const svm_node *py)
{
	double sum = 0;
	int dim = min(px->dim, py->dim);
for (int i = 0; i < dim; i++)
sum += (px->values)[i] * (py->values)[i];
return sum;
}
double Kernel::dot(const svm_node &px, const svm_node &py)
{
double sum = 0;
int dim = min(px.dim, py.dim);
for (int i = 0; i < dim; i++)
sum += px.values[i] * py.values[i];
return sum;
}
#else
double Kernel::dot(const svm_node *px, const svm_node *py)
{
double sum = 0;
while(px->index != -1 && py->index != -1)
{
if(px->index == py->index)
{
sum += px->value * py->value;
++px;
++py;
}
else
{
if(px->index > py->index)
++py;
else
++px;
}
}
return sum;
}
#endif
double Kernel::k_function(const svm_node *x, const svm_node *y,
const svm_parameter& param)
{
switch(param.kernel_type)
{
case LINEAR:
return dot(x,y);
case POLY:
return powi(param.gamma*dot(x,y)+param.coef0,param.degree);
case RBF:
{
double sum = 0;
#ifdef _DENSE_REP
int dim = min(x->dim, y->dim), i;
for (i = 0; i < dim; i++)
{
double d = x->values[i] - y->values[i];
sum += d*d;
}
for (; i < x->dim; i++)
sum += x->values[i] * x->values[i];
for (; i < y->dim; i++)
sum += y->values[i] * y->values[i];
#else
while(x->index != -1 && y->index !=-1)
{
if(x->index == y->index)
{
double d = x->value - y->value;
sum += d*d;
++x;
++y;
}
else
{
if(x->index > y->index)
{
sum += y->value * y->value;
++y;
}
else
{
sum += x->value * x->value;
++x;
}
}
}
while(x->index != -1)
{
sum += x->value * x->value;
++x;
}
while(y->index != -1)
{
sum += y->value * y->value;
++y;
}
#endif
return exp(-param.gamma*sum);
}
case SIGMOID:
return tanh(param.gamma*dot(x,y)+param.coef0);
case PRECOMPUTED: //x: test (validation), y: SV
#ifdef _DENSE_REP
return x->values[(int)(y->values[0])];
#else
return x[(int)(y->value)].value;
#endif
default:
return 0; // Unreachable
}
}
// An SMO algorithm in Fan et al., JMLR 6(2005), p. 1889--1918
// Solves:
//
// min 0.5(\alpha^T Q \alpha) + p^T \alpha
//
// y^T \alpha = \delta
// y_i = +1 or -1
// 0 <= alpha_i <= Cp for y_i = 1
// 0 <= alpha_i <= Cn for y_i = -1
//
// Given:
//
// Q, p, y, Cp, Cn, and an initial feasible point \alpha
// l is the size of vectors and matrices
// eps is the stopping tolerance
//
// solution will be put in \alpha, objective value will be put in obj
//
class Solver {
public:
Solver() {};
virtual ~Solver() {};
struct SolutionInfo {
double obj;
double rho;
double upper_bound_p;
double upper_bound_n;
double r; // for Solver_NU
};
void Solve(int l, const QMatrix& Q, const double *p_, const schar *y_,
double *alpha_, double Cp, double Cn, double eps,
SolutionInfo* si, int shrinking);
protected:
int active_size;
schar *y;
double *G; // gradient of objective function
enum { LOWER_BOUND, UPPER_BOUND, FREE };
char *alpha_status; // LOWER_BOUND, UPPER_BOUND, FREE
double *alpha;
const QMatrix *Q;
const double *QD;
double eps;
double Cp,Cn;
double *p;
int *active_set;
double *G_bar; // gradient, if we treat free variables as 0
int l;
bool unshrink; // XXX
double get_C(int i)
{
return (y[i] > 0)? Cp : Cn;
}
void update_alpha_status(int i)
{
if(alpha[i] >= get_C(i))
alpha_status[i] = UPPER_BOUND;
else if(alpha[i] <= 0)
alpha_status[i] = LOWER_BOUND;
else alpha_status[i] = FREE;
}
bool is_upper_bound(int i) { return alpha_status[i] == UPPER_BOUND; }
bool is_lower_bound(int i) { return alpha_status[i] == LOWER_BOUND; }
bool is_free(int i) { return alpha_status[i] == FREE; }
void swap_index(int i, int j);
void reconstruct_gradient();
virtual int select_working_set(int &i, int &j);
virtual double calculate_rho();
virtual void do_shrinking();
private:
bool be_shrunk(int i, double Gmax1, double Gmax2);
};
void Solver::swap_index(int i, int j)
{
Q->swap_index(i,j);
swap(y[i],y[j]);
swap(G[i],G[j]);
swap(alpha_status[i],alpha_status[j]);
swap(alpha[i],alpha[j]);
swap(p[i],p[j]);
swap(active_set[i],active_set[j]);
swap(G_bar[i],G_bar[j]);
}
void Solver::reconstruct_gradient()
{
// reconstruct inactive elements of G from G_bar and free variables
if(active_size == l) return;
int i,j;
int nr_free = 0;
	for(j=active_size;j<l;j++)
		G[j] = G_bar[j] + p[j];

	for(j=0;j<active_size;j++)
		if(is_free(j))
			nr_free++;

	if(2*nr_free < active_size)
		info("\nWARNING: using -h 0 may be faster\n");

	if (nr_free*l > 2*active_size*(l-active_size))
	{
		for(i=active_size;i<l;i++)
		{
			const Qfloat *Q_i = Q->get_Q(i,active_size);
			for(j=0;j<active_size;j++)
				if(is_free(j))
					G[i] += alpha[j] * Q_i[j];
		}
	}
	else
	{
		for(i=0;i<active_size;i++)
		{
			const Qfloat *Q_i = Q->get_Q(i,l);
			double alpha_i = alpha[i];
			for(j=active_size;j<l;j++)
				G[j] += alpha_i * Q_i[j];
		}
	}
}

void Solver::Solve(int l, const QMatrix& Q, const double *p_, const schar *y_,
		   double *alpha_, double Cp, double Cn, double eps,
		   SolutionInfo* si, int shrinking)
{
	this->l = l;
this->Q = &Q;
QD=Q.get_QD();
clone(p, p_,l);
clone(y, y_,l);
clone(alpha,alpha_,l);
this->Cp = Cp;
this->Cn = Cn;
this->eps = eps;
unshrink = false;
// initialize alpha_status
{
alpha_status = new char[l];
		for(int i=0;i<l;i++)
			update_alpha_status(i);
	}

	// initialize active set (for shrinking)
	{
		active_set = new int[l];
		for(int i=0;i<l;i++)
			active_set[i] = i;
		active_size = l;
	}

	// initialize gradient
	{
		G = new double[l];
		G_bar = new double[l];
		int i;
		for(i=0;i<l;i++)
		{
			G[i] = p[i];
			G_bar[i] = 0;
		}
		for(i=0;i<l;i++)
			if(!is_lower_bound(i))
			{
				const Qfloat *Q_i = Q.get_Q(i,l);
				double alpha_i = alpha[i];
				int j;
				for(j=0;j<l;j++)
					G[j] += alpha_i*Q_i[j];
				if(is_upper_bound(i))
					for(j=0;j<l;j++)
						G_bar[j] += get_C(i) * Q_i[j];
			}
	}

	// optimization step

	int iter = 0;
	int max_iter = max(10000000, l>INT_MAX/100 ? INT_MAX : 100*l);
int counter = min(l,1000)+1;
while(iter < max_iter)
{
// show progress and do shrinking
if(--counter == 0)
{
counter = min(l,1000);
if(shrinking) do_shrinking();
info(".");
}
int i,j;
if(select_working_set(i,j)!=0)
{
// reconstruct the whole gradient
reconstruct_gradient();
// reset active set size and check
active_size = l;
info("*");
if(select_working_set(i,j)!=0)
break;
else
counter = 1; // do shrinking next iteration
}
++iter;
// update alpha[i] and alpha[j], handle bounds carefully
const Qfloat *Q_i = Q.get_Q(i,active_size);
const Qfloat *Q_j = Q.get_Q(j,active_size);
double C_i = get_C(i);
double C_j = get_C(j);
double old_alpha_i = alpha[i];
double old_alpha_j = alpha[j];
if(y[i]!=y[j])
{
double quad_coef = QD[i]+QD[j]+2*Q_i[j];
if (quad_coef <= 0)
quad_coef = TAU;
double delta = (-G[i]-G[j])/quad_coef;
double diff = alpha[i] - alpha[j];
alpha[i] += delta;
alpha[j] += delta;
if(diff > 0)
{
if(alpha[j] < 0)
{
alpha[j] = 0;
alpha[i] = diff;
}
}
else
{
if(alpha[i] < 0)
{
alpha[i] = 0;
alpha[j] = -diff;
}
}
if(diff > C_i - C_j)
{
if(alpha[i] > C_i)
{
alpha[i] = C_i;
alpha[j] = C_i - diff;
}
}
else
{
if(alpha[j] > C_j)
{
alpha[j] = C_j;
alpha[i] = C_j + diff;
}
}
}
else
{
double quad_coef = QD[i]+QD[j]-2*Q_i[j];
if (quad_coef <= 0)
quad_coef = TAU;
double delta = (G[i]-G[j])/quad_coef;
double sum = alpha[i] + alpha[j];
alpha[i] -= delta;
alpha[j] += delta;
if(sum > C_i)
{
if(alpha[i] > C_i)
{
alpha[i] = C_i;
alpha[j] = sum - C_i;
}
}
else
{
if(alpha[j] < 0)
{
alpha[j] = 0;
alpha[i] = sum;
}
}
if(sum > C_j)
{
if(alpha[j] > C_j)
{
alpha[j] = C_j;
alpha[i] = sum - C_j;
}
}
else
{
if(alpha[i] < 0)
{
alpha[i] = 0;
alpha[j] = sum;
}
}
}
// update G
double delta_alpha_i = alpha[i] - old_alpha_i;
double delta_alpha_j = alpha[j] - old_alpha_j;
		for(int k=0;k<active_size;k++)
		{
			G[k] += Q_i[k]*delta_alpha_i + Q_j[k]*delta_alpha_j;
		}

		// update alpha_status and G_bar
		{
			bool ui = is_upper_bound(i);
			bool uj = is_upper_bound(j);
			update_alpha_status(i);
			update_alpha_status(j);
			int k;
			if(ui != is_upper_bound(i))
			{
				Q_i = Q.get_Q(i,l);
				if(ui)
					for(k=0;k<l;k++)
						G_bar[k] -= C_i * Q_i[k];
				else
					for(k=0;k<l;k++)
						G_bar[k] += C_i * Q_i[k];
			}

			if(uj != is_upper_bound(j))
			{
				Q_j = Q.get_Q(j,l);
				if(uj)
					for(k=0;k<l;k++)
						G_bar[k] -= C_j * Q_j[k];
				else
					for(k=0;k<l;k++)
						G_bar[k] += C_j * Q_j[k];
			}
		}
	}

	if(iter >= max_iter)
{
if(active_size < l)
{
// reconstruct the whole gradient to calculate objective value
reconstruct_gradient();
active_size = l;
info("*");
}
fprintf(stderr,"\nWARNING: reaching max number of iterations\n");
}
// calculate rho
si->rho = calculate_rho();
// calculate objective value
{
double v = 0;
int i;
		for(i=0;i<l;i++)
			v += alpha[i] * (G[i] + p[i]);

		si->obj = v/2;
}
// put back the solution
{
		for(int i=0;i<l;i++)
			alpha_[active_set[i]] = alpha[i];
	}

	si->upper_bound_p = Cp;
si->upper_bound_n = Cn;
info("\noptimization finished, #iter = %d\n",iter);
delete[] p;
delete[] y;
delete[] alpha;
delete[] alpha_status;
delete[] active_set;
delete[] G;
delete[] G_bar;
}
// return 1 if already optimal, return 0 otherwise
int Solver::select_working_set(int &out_i, int &out_j)
{
// return i,j such that
// i: maximizes -y_i * grad(f)_i, i in I_up(\alpha)
// j: minimizes the decrease of obj value
// (if quadratic coefficient <= 0, replace it with tau)
// -y_j*grad(f)_j < -y_i*grad(f)_i, j in I_low(\alpha)
double Gmax = -INF;
double Gmax2 = -INF;
int Gmax_idx = -1;
int Gmin_idx = -1;
double obj_diff_min = INF;
	for(int t=0;t<active_size;t++)
		if(y[t]==+1)
		{
			if(!is_upper_bound(t))
				if(-G[t] >= Gmax)
{
Gmax = -G[t];
Gmax_idx = t;
}
}
else
{
if(!is_lower_bound(t))
if(G[t] >= Gmax)
{
Gmax = G[t];
Gmax_idx = t;
}
}
int i = Gmax_idx;
const Qfloat *Q_i = NULL;
if(i != -1) // NULL Q_i not accessed: Gmax=-INF if i=-1
Q_i = Q->get_Q(i,active_size);
	for(int j=0;j<active_size;j++)
	{
		if(y[j]==+1)
		{
			if (!is_lower_bound(j))
			{
				double grad_diff=Gmax+G[j];
				if (G[j] >= Gmax2)
Gmax2 = G[j];
if (grad_diff > 0)
{
double obj_diff;
double quad_coef = QD[i]+QD[j]-2.0*y[i]*Q_i[j];
if (quad_coef > 0)
obj_diff = -(grad_diff*grad_diff)/quad_coef;
else
obj_diff = -(grad_diff*grad_diff)/TAU;
if (obj_diff <= obj_diff_min)
{
Gmin_idx=j;
obj_diff_min = obj_diff;
}
}
}
}
else
{
if (!is_upper_bound(j))
{
double grad_diff= Gmax-G[j];
if (-G[j] >= Gmax2)
Gmax2 = -G[j];
if (grad_diff > 0)
{
double obj_diff;
double quad_coef = QD[i]+QD[j]+2.0*y[i]*Q_i[j];
if (quad_coef > 0)
obj_diff = -(grad_diff*grad_diff)/quad_coef;
else
obj_diff = -(grad_diff*grad_diff)/TAU;
if (obj_diff <= obj_diff_min)
{
Gmin_idx=j;
obj_diff_min = obj_diff;
}
}
}
}
}
if(Gmax+Gmax2 < eps)
return 1;
out_i = Gmax_idx;
out_j = Gmin_idx;
return 0;
}
bool Solver::be_shrunk(int i, double Gmax1, double Gmax2)
{
if(is_upper_bound(i))
{
if(y[i]==+1)
return(-G[i] > Gmax1);
else
return(-G[i] > Gmax2);
}
else if(is_lower_bound(i))
{
if(y[i]==+1)
return(G[i] > Gmax2);
else
return(G[i] > Gmax1);
}
else
return(false);
}
void Solver::do_shrinking()
{
int i;
double Gmax1 = -INF; // max { -y_i * grad(f)_i | i in I_up(\alpha) }
double Gmax2 = -INF; // max { y_i * grad(f)_i | i in I_low(\alpha) }
// find maximal violating pair first
	for(i=0;i<active_size;i++)
	{
		if(y[i]==+1)
		{
			if(!is_upper_bound(i))
			{
				if(-G[i] >= Gmax1)
Gmax1 = -G[i];
}
if(!is_lower_bound(i))
{
if(G[i] >= Gmax2)
Gmax2 = G[i];
}
}
else
{
if(!is_upper_bound(i))
{
if(-G[i] >= Gmax2)
Gmax2 = -G[i];
}
if(!is_lower_bound(i))
{
if(G[i] >= Gmax1)
Gmax1 = G[i];
}
}
}
if(unshrink == false && Gmax1 + Gmax2 <= eps*10)
{
unshrink = true;
reconstruct_gradient();
active_size = l;
info("*");
}
	for(i=0;i<active_size;i++)
		if (be_shrunk(i, Gmax1, Gmax2))
		{
			active_size--;
			while (active_size > i)
{
if (!be_shrunk(active_size, Gmax1, Gmax2))
{
swap_index(i,active_size);
break;
}
active_size--;
}
}
}
double Solver::calculate_rho()
{
double r;
int nr_free = 0;
double ub = INF, lb = -INF, sum_free = 0;
	for(int i=0;i<active_size;i++)
	{
		double yG = y[i]*G[i];

		if(is_upper_bound(i))
		{
			if(y[i]==-1)
				ub = min(ub,yG);
			else
				lb = max(lb,yG);
		}
		else if(is_lower_bound(i))
		{
			if(y[i]==+1)
				ub = min(ub,yG);
			else
				lb = max(lb,yG);
		}
		else
		{
			++nr_free;
			sum_free += yG;
		}
	}

	if(nr_free>0)
r = sum_free/nr_free;
else
r = (ub+lb)/2;
return r;
}
//
// Solver for nu-svm classification and regression
//
// additional constraint: e^T \alpha = constant
//
class Solver_NU: public Solver
{
public:
Solver_NU() {}
void Solve(int l, const QMatrix& Q, const double *p, const schar *y,
double *alpha, double Cp, double Cn, double eps,
SolutionInfo* si, int shrinking)
{
this->si = si;
Solver::Solve(l,Q,p,y,alpha,Cp,Cn,eps,si,shrinking);
}
private:
SolutionInfo *si;
int select_working_set(int &i, int &j);
double calculate_rho();
bool be_shrunk(int i, double Gmax1, double Gmax2, double Gmax3, double Gmax4);
void do_shrinking();
};
// return 1 if already optimal, return 0 otherwise
int Solver_NU::select_working_set(int &out_i, int &out_j)
{
// return i,j such that y_i = y_j and
// i: maximizes -y_i * grad(f)_i, i in I_up(\alpha)
// j: minimizes the decrease of obj value
// (if quadratic coefficient <= 0, replace it with tau)
// -y_j*grad(f)_j < -y_i*grad(f)_i, j in I_low(\alpha)
double Gmaxp = -INF;
double Gmaxp2 = -INF;
int Gmaxp_idx = -1;
double Gmaxn = -INF;
double Gmaxn2 = -INF;
int Gmaxn_idx = -1;
int Gmin_idx = -1;
double obj_diff_min = INF;
	for(int t=0;t<active_size;t++)
		if(y[t]==+1)
		{
			if(!is_upper_bound(t))
				if(-G[t] >= Gmaxp)
{
Gmaxp = -G[t];
Gmaxp_idx = t;
}
}
else
{
if(!is_lower_bound(t))
if(G[t] >= Gmaxn)
{
Gmaxn = G[t];
Gmaxn_idx = t;
}
}
int ip = Gmaxp_idx;
int in = Gmaxn_idx;
const Qfloat *Q_ip = NULL;
const Qfloat *Q_in = NULL;
if(ip != -1) // NULL Q_ip not accessed: Gmaxp=-INF if ip=-1
Q_ip = Q->get_Q(ip,active_size);
if(in != -1)
Q_in = Q->get_Q(in,active_size);
	for(int j=0;j<active_size;j++)
	{
		if(y[j]==+1)
		{
			if (!is_lower_bound(j))
			{
				double grad_diff=Gmaxp+G[j];
				if (G[j] >= Gmaxp2)
Gmaxp2 = G[j];
if (grad_diff > 0)
{
double obj_diff;
double quad_coef = QD[ip]+QD[j]-2*Q_ip[j];
if (quad_coef > 0)
obj_diff = -(grad_diff*grad_diff)/quad_coef;
else
obj_diff = -(grad_diff*grad_diff)/TAU;
if (obj_diff <= obj_diff_min)
{
Gmin_idx=j;
obj_diff_min = obj_diff;
}
}
}
}
else
{
if (!is_upper_bound(j))
{
double grad_diff=Gmaxn-G[j];
if (-G[j] >= Gmaxn2)
Gmaxn2 = -G[j];
if (grad_diff > 0)
{
double obj_diff;
double quad_coef = QD[in]+QD[j]-2*Q_in[j];
if (quad_coef > 0)
obj_diff = -(grad_diff*grad_diff)/quad_coef;
else
obj_diff = -(grad_diff*grad_diff)/TAU;
if (obj_diff <= obj_diff_min)
{
Gmin_idx=j;
obj_diff_min = obj_diff;
}
}
}
}
}
if(max(Gmaxp+Gmaxp2,Gmaxn+Gmaxn2) < eps)
return 1;
if (y[Gmin_idx] == +1)
out_i = Gmaxp_idx;
else
out_i = Gmaxn_idx;
out_j = Gmin_idx;
return 0;
}
bool Solver_NU::be_shrunk(int i, double Gmax1, double Gmax2, double Gmax3, double Gmax4)
{
if(is_upper_bound(i))
{
if(y[i]==+1)
return(-G[i] > Gmax1);
else
return(-G[i] > Gmax4);
}
else if(is_lower_bound(i))
{
if(y[i]==+1)
return(G[i] > Gmax2);
else
return(G[i] > Gmax3);
}
else
return(false);
}
void Solver_NU::do_shrinking()
{
double Gmax1 = -INF; // max { -y_i * grad(f)_i | y_i = +1, i in I_up(\alpha) }
double Gmax2 = -INF; // max { y_i * grad(f)_i | y_i = +1, i in I_low(\alpha) }
double Gmax3 = -INF; // max { -y_i * grad(f)_i | y_i = -1, i in I_up(\alpha) }
double Gmax4 = -INF; // max { y_i * grad(f)_i | y_i = -1, i in I_low(\alpha) }
// find maximal violating pair first
int i;
	for(i=0;i<active_size;i++)
	{
		if(!is_upper_bound(i))
		{
			if(y[i]==+1)
			{
				if(-G[i] > Gmax1) Gmax1 = -G[i];
}
else if(-G[i] > Gmax4) Gmax4 = -G[i];
}
if(!is_lower_bound(i))
{
if(y[i]==+1)
{
if(G[i] > Gmax2) Gmax2 = G[i];
}
else if(G[i] > Gmax3) Gmax3 = G[i];
}
}
if(unshrink == false && max(Gmax1+Gmax2,Gmax3+Gmax4) <= eps*10)
{
unshrink = true;
reconstruct_gradient();
active_size = l;
}
	for(i=0;i<active_size;i++)
		if (be_shrunk(i, Gmax1, Gmax2, Gmax3, Gmax4))
		{
			active_size--;
			while (active_size > i)
{
if (!be_shrunk(active_size, Gmax1, Gmax2, Gmax3, Gmax4))
{
swap_index(i,active_size);
break;
}
active_size--;
}
}
}
double Solver_NU::calculate_rho()
{
int nr_free1 = 0,nr_free2 = 0;
double ub1 = INF, ub2 = INF;
double lb1 = -INF, lb2 = -INF;
double sum_free1 = 0, sum_free2 = 0;
	for(int i=0;i<active_size;i++)
	{
		if(y[i]==+1)
		{
			if(is_upper_bound(i))
				lb1 = max(lb1,G[i]);
			else if(is_lower_bound(i))
				ub1 = min(ub1,G[i]);
			else
			{
				++nr_free1;
				sum_free1 += G[i];
			}
		}
		else
		{
			if(is_upper_bound(i))
				lb2 = max(lb2,G[i]);
			else if(is_lower_bound(i))
				ub2 = min(ub2,G[i]);
			else
			{
				++nr_free2;
				sum_free2 += G[i];
			}
		}
	}

	double r1,r2;
	if(nr_free1 > 0)
r1 = sum_free1/nr_free1;
else
r1 = (ub1+lb1)/2;
if(nr_free2 > 0)
r2 = sum_free2/nr_free2;
else
r2 = (ub2+lb2)/2;
si->r = (r1+r2)/2;
return (r1-r2)/2;
}
//
// Q matrices for various formulations
//
class SVC_Q: public Kernel
{
public:
SVC_Q(const svm_problem& prob, const svm_parameter& param, const schar *y_)
:Kernel(prob.l, prob.x, param)
{
clone(y,y_,prob.l);
cache = new Cache(prob.l,(long int)(param.cache_size*(1<<20)));
QD = new double[prob.l];
		for(int i=0;i<prob.l;i++)
			QD[i] = (this->*kernel_function)(i,i);
}
Qfloat *get_Q(int i, int len) const
{
Qfloat *data;
int start, j;
if((start = cache->get_data(i,&data,len)) < len)
{
			for(j=start;j<len;j++)
				data[j] = (Qfloat)(y[i]*y[j]*(this->*kernel_function)(i,j));
}
return data;
}
double *get_QD() const
{
return QD;
}
void swap_index(int i, int j) const
{
cache->swap_index(i,j);
Kernel::swap_index(i,j);
swap(y[i],y[j]);
swap(QD[i],QD[j]);
}
~SVC_Q()
{
delete[] y;
delete cache;
delete[] QD;
}
private:
schar *y;
Cache *cache;
double *QD;
};
class ONE_CLASS_Q: public Kernel
{
public:
ONE_CLASS_Q(const svm_problem& prob, const svm_parameter& param)
:Kernel(prob.l, prob.x, param)
{
cache = new Cache(prob.l,(long int)(param.cache_size*(1<<20)));
QD = new double[prob.l];
		for(int i=0;i<prob.l;i++)
			QD[i] = (this->*kernel_function)(i,i);
}
Qfloat *get_Q(int i, int len) const
{
Qfloat *data;
int start, j;
if((start = cache->get_data(i,&data,len)) < len)
{
			for(j=start;j<len;j++)
				data[j] = (Qfloat)(this->*kernel_function)(i,j);
}
return data;
}
double *get_QD() const
{
return QD;
}
void swap_index(int i, int j) const
{
cache->swap_index(i,j);
Kernel::swap_index(i,j);
swap(QD[i],QD[j]);
}
~ONE_CLASS_Q()
{
delete cache;
delete[] QD;
}
private:
Cache *cache;
double *QD;
};
class SVR_Q: public Kernel
{
public:
SVR_Q(const svm_problem& prob, const svm_parameter& param)
:Kernel(prob.l, prob.x, param)
{
l = prob.l;
cache = new Cache(l,(long int)(param.cache_size*(1<<20)));
QD = new double[2*l];
sign = new schar[2*l];
index = new int[2*l];
		for(int k=0;k<l;k++)
		{
			sign[k] = 1;
			sign[k+l] = -1;
			index[k] = k;
			index[k+l] = k;
			QD[k] = (this->*kernel_function)(k,k);
QD[k+l] = QD[k];
}
buffer[0] = new Qfloat[2*l];
buffer[1] = new Qfloat[2*l];
next_buffer = 0;
}
void swap_index(int i, int j) const
{
swap(sign[i],sign[j]);
swap(index[i],index[j]);
swap(QD[i],QD[j]);
}
Qfloat *get_Q(int i, int len) const
{
Qfloat *data;
int j, real_i = index[i];
if(cache->get_data(real_i,&data,l) < l)
{
			for(j=0;j<l;j++)
				data[j] = (Qfloat)(this->*kernel_function)(real_i,j);
}
// reorder and copy
Qfloat *buf = buffer[next_buffer];
next_buffer = 1 - next_buffer;
schar si = sign[i];
		for(j=0;j<len;j++)
			buf[j] = (Qfloat) si * (Qfloat) sign[j] * data[index[j]];
		return buf;
	}

	double *get_QD() const
	{
		return QD;
	}

	~SVR_Q()
	{
		delete cache;
		delete[] sign;
		delete[] index;
		delete[] buffer[0];
		delete[] buffer[1];
		delete[] QD;
	}
private:
	int l;
	Cache *cache;
	schar *sign;
	int *index;
	mutable int next_buffer;
	Qfloat *buffer[2];
	double *QD;
};

//
// construct and solve various formulations
//
static void solve_c_svc(
	const svm_problem *prob, const svm_parameter *param,
	double *alpha, Solver::SolutionInfo* si, double Cp, double Cn)
{
	int l = prob->l;
double *minus_ones = new double[l];
schar *y = new schar[l];
int i;
	for(i=0;i<l;i++)
	{
		alpha[i] = 0;
		minus_ones[i] = -1;
		if(prob->y[i] > 0) y[i] = +1; else y[i] = -1;
}
Solver s;
s.Solve(l, SVC_Q(*prob,*param,y), minus_ones, y,
alpha, Cp, Cn, param->eps, si, param->shrinking);
double sum_alpha=0;
	for(i=0;i<l;i++)
		sum_alpha += alpha[i];

	if (Cp==Cn)
		info("nu = %f\n", sum_alpha/(Cp*prob->l));

	for(i=0;i<l;i++)
		alpha[i] *= y[i];

	delete[] minus_ones;
	delete[] y;
}

static void solve_nu_svc(
	const svm_problem *prob, const svm_parameter *param,
	double *alpha, Solver::SolutionInfo* si)
{
	int i;
	int l = prob->l;
double nu = param->nu;
schar *y = new schar[l];
	for(i=0;i<l;i++)
		if(prob->y[i]>0)
y[i] = +1;
else
y[i] = -1;
double sum_pos = nu*l/2;
double sum_neg = nu*l/2;
	for(i=0;i<l;i++)
		if(y[i] == +1)
		{
			alpha[i] = min(1.0,sum_pos);
			sum_pos -= alpha[i];
		}
		else
		{
			alpha[i] = min(1.0,sum_neg);
			sum_neg -= alpha[i];
		}
	double *zeros = new double[l];
	for(i=0;i<l;i++)
		zeros[i] = 0;
	Solver_NU s;
	s.Solve(l, SVC_Q(*prob,*param,y), zeros, y,
		alpha, 1.0, 1.0, param->eps, si, param->shrinking);
double r = si->r;
info("C = %f\n",1/r);
	for(i=0;i<l;i++)
		alpha[i] *= y[i]/r;
	si->rho /= r;
si->obj /= (r*r);
si->upper_bound_p = 1/r;
si->upper_bound_n = 1/r;
delete[] y;
delete[] zeros;
}
static void solve_one_class(
const svm_problem *prob, const svm_parameter *param,
double *alpha, Solver::SolutionInfo* si)
{
int l = prob->l;
double *zeros = new double[l];
schar *ones = new schar[l];
int i;
int n = (int)(param->nu*prob->l); // # of alpha's at upper bound
	for(i=0;i<n;i++)
		alpha[i] = 1;
	if(n<prob->l)
alpha[n] = param->nu * prob->l - n;
	for(i=n+1;i<l;i++)
		alpha[i] = 0;
	for(i=0;i<l;i++)
	{
		zeros[i] = 0;
		ones[i] = 1;
	}
	Solver s;
	s.Solve(l, ONE_CLASS_Q(*prob,*param), zeros, ones,
		alpha, 1.0, 1.0, param->eps, si, param->shrinking);
delete[] zeros;
delete[] ones;
}
static void solve_epsilon_svr(
const svm_problem *prob, const svm_parameter *param,
double *alpha, Solver::SolutionInfo* si)
{
int l = prob->l;
double *alpha2 = new double[2*l];
double *linear_term = new double[2*l];
schar *y = new schar[2*l];
int i;
	for(i=0;i<l;i++)
	{
		alpha2[i] = 0;
		linear_term[i] = param->p - prob->y[i];
y[i] = 1;
alpha2[i+l] = 0;
linear_term[i+l] = param->p + prob->y[i];
y[i+l] = -1;
}
Solver s;
s.Solve(2*l, SVR_Q(*prob,*param), linear_term, y,
alpha2, param->C, param->C, param->eps, si, param->shrinking);
double sum_alpha = 0;
	for(i=0;i<l;i++)
	{
		alpha[i] = alpha2[i] - alpha2[i+l];
		sum_alpha += fabs(alpha[i]);
	}
	info("nu = %f\n",sum_alpha/(param->C*l));
delete[] alpha2;
delete[] linear_term;
delete[] y;
}
static void solve_nu_svr(
const svm_problem *prob, const svm_parameter *param,
double *alpha, Solver::SolutionInfo* si)
{
int l = prob->l;
double C = param->C;
double *alpha2 = new double[2*l];
double *linear_term = new double[2*l];
schar *y = new schar[2*l];
int i;
double sum = C * param->nu * l / 2;
	for(i=0;i<l;i++)
	{
		alpha2[i] = alpha2[i+l] = min(sum,C);
		sum -= alpha2[i];
		linear_term[i] = - prob->y[i];
y[i] = 1;
linear_term[i+l] = prob->y[i];
y[i+l] = -1;
}
Solver_NU s;
s.Solve(2*l, SVR_Q(*prob,*param), linear_term, y,
alpha2, C, C, param->eps, si, param->shrinking);
info("epsilon = %f\n",-si->r);
	for(i=0;i<l;i++)
		alpha[i] = alpha2[i] - alpha2[i+l];
	delete[] alpha2;
	delete[] linear_term;
	delete[] y;
}
//
// decision_function
//
struct decision_function
{
	double *alpha;
	double rho;
};
static decision_function svm_train_one(
	const svm_problem *prob, const svm_parameter *param,
	double Cp, double Cn)
{
	double *alpha = Malloc(double,prob->l);
Solver::SolutionInfo si;
switch(param->svm_type)
{
case C_SVC:
solve_c_svc(prob,param,alpha,&si,Cp,Cn);
break;
case NU_SVC:
solve_nu_svc(prob,param,alpha,&si);
break;
case ONE_CLASS:
solve_one_class(prob,param,alpha,&si);
break;
case EPSILON_SVR:
solve_epsilon_svr(prob,param,alpha,&si);
break;
case NU_SVR:
solve_nu_svr(prob,param,alpha,&si);
break;
}
info("obj = %f, rho = %f\n",si.obj,si.rho);
// output SVs
int nSV = 0;
int nBSV = 0;
	for(int i=0;i<prob->l;i++)
{
if(fabs(alpha[i]) > 0)
{
++nSV;
if(prob->y[i] > 0)
{
if(fabs(alpha[i]) >= si.upper_bound_p)
++nBSV;
}
else
{
if(fabs(alpha[i]) >= si.upper_bound_n)
++nBSV;
}
}
}
info("nSV = %d, nBSV = %d\n",nSV,nBSV);
decision_function f;
f.alpha = alpha;
f.rho = si.rho;
return f;
}
// Platt's binary SVM Probabilistic Output: an improvement from Lin et al.
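// Fits P(y=1|f) = 1/(1+exp(A*f+B)) to the decision values f by regularized
// maximum likelihood, using Newton's method with a backtracking line search.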
static void sigmoid_train(
int l, const double *dec_values, const double *labels,
double& A, double& B)
{
double prior1=0, prior0 = 0;
int i;
	for (i=0;i<l;i++)
		if (labels[i] > 0) prior1+=1;
else prior0+=1;
int max_iter=100; // Maximal number of iterations
double min_step=1e-10; // Minimal step taken in line search
double sigma=1e-12; // For numerically strict PD of Hessian
double eps=1e-5;
double hiTarget=(prior1+1.0)/(prior1+2.0);
double loTarget=1/(prior0+2.0);
double *t=Malloc(double,l);
double fApB,p,q,h11,h22,h21,g1,g2,det,dA,dB,gd,stepsize;
double newA,newB,newf,d1,d2;
int iter;
// Initial Point and Initial Fun Value
A=0.0; B=log((prior0+1.0)/(prior1+1.0));
double fval = 0.0;
	for (i=0;i<l;i++)
	{
		if (labels[i]>0) t[i]=hiTarget;
else t[i]=loTarget;
fApB = dec_values[i]*A+B;
if (fApB>=0)
fval += t[i]*fApB + log(1+exp(-fApB));
else
fval += (t[i] - 1)*fApB +log(1+exp(fApB));
}
	for (iter=0;iter<max_iter;iter++)
	{
		// Update Gradient and Hessian (use H' = H + sigma I)
		h11=sigma; // numerically ensures strict PD
		h22=sigma;
		h21=0.0;g1=0.0;g2=0.0;
		for (i=0;i<l;i++)
		{
			fApB = dec_values[i]*A+B;
			if (fApB >= 0)
{
p=exp(-fApB)/(1.0+exp(-fApB));
q=1.0/(1.0+exp(-fApB));
}
else
{
p=1.0/(1.0+exp(fApB));
q=exp(fApB)/(1.0+exp(fApB));
}
d2=p*q;
h11+=dec_values[i]*dec_values[i]*d2;
h22+=d2;
h21+=dec_values[i]*d2;
d1=t[i]-p;
g1+=dec_values[i]*d1;
g2+=d1;
}
// Stopping Criteria
		if (fabs(g1)<eps && fabs(g2)<eps)
			break;
		// Finding Newton direction: -inv(H') * g
		det=h11*h22-h21*h21;
		dA=-(h22*g1 - h21 * g2) / det;
		dB=-(-h21*g1+ h11 * g2) / det;
		gd=g1*dA+g2*dB;
		stepsize = 1;	// Line Search
		while (stepsize >= min_step)
{
newA = A + stepsize * dA;
newB = B + stepsize * dB;
// New function value
newf = 0.0;
			for (i=0;i<l;i++)
			{
				fApB = dec_values[i]*newA+newB;
				if (fApB >= 0)
newf += t[i]*fApB + log(1+exp(-fApB));
else
newf += (t[i] - 1)*fApB +log(1+exp(fApB));
}
// Check sufficient decrease
			if (newf<fval+0.0001*stepsize*gd)
			{
				A=newA;B=newB;fval=newf;
				break;
			}
			else
				stepsize = stepsize / 2.0;
		}
		if (stepsize < min_step)
		{
			info("Line search fails in two-class probability estimates\n");
			break;
		}
	}
	if (iter>=max_iter)
info("Reaching maximal iterations in two-class probability estimates\n");
free(t);
}
static double sigmoid_predict(double decision_value, double A, double B)
{
double fApB = decision_value*A+B;
// 1-p used later; avoid catastrophic cancellation
if (fApB >= 0)
return exp(-fApB)/(1.0+exp(-fApB));
else
return 1.0/(1+exp(fApB)) ;
}
// Method 2 from the multiclass_prob paper by Wu, Lin, and Weng
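// Given the pairwise class probabilities r[i][j], this solves for the class
// distribution p minimizing (1/2) p^T Q p subject to sum(p) = 1, using the
// fixed-point iteration below; the tolerance eps shrinks with k.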
static void multiclass_probability(int k, double **r, double *p)
{
int t,j;
int iter = 0, max_iter=max(100,k);
double **Q=Malloc(double *,k);
double *Qp=Malloc(double,k);
double pQp, eps=0.005/k;
	for (t=0;t<k;t++)
	{
		p[t]=1.0/k;  // Valid if k = 1
		Q[t]=Malloc(double,k);
		Q[t][t]=0;
		for (j=0;j<t;j++)
		{
			Q[t][t]+=r[j][t]*r[j][t];
			Q[t][j]=Q[j][t];
		}
		for (j=t+1;j<k;j++)
		{
			Q[t][t]+=r[j][t]*r[j][t];
			Q[t][j]=-r[j][t]*r[t][j];
		}
	}
	for (iter=0;iter<max_iter;iter++)
	{
		// stopping condition, recalculate QP,pQP for numerical accuracy
		pQp=0;
		for (t=0;t<k;t++)
		{
			Qp[t]=0;
			for (j=0;j<k;j++)
				Qp[t]+=Q[t][j]*p[j];
			pQp+=p[t]*Qp[t];
		}
		double max_error=0;
		for (t=0;t<k;t++)
		{
			double error=fabs(Qp[t]-pQp);
			if (error>max_error)
max_error=error;
}
		if (max_error<eps) break;
		for (t=0;t<k;t++)
		{
			double diff=(-Qp[t]+pQp)/Q[t][t];
			p[t]+=diff;
			pQp=(pQp+diff*(diff*Q[t][t]+2*Qp[t]))/(1+diff)/(1+diff);
			for (j=0;j<k;j++)
			{
				Qp[j]=(Qp[j]+diff*Q[t][j])/(1+diff);
				p[j]/=(1+diff);
			}
		}
	}
	if (iter>=max_iter)
info("Exceeds max_iter in multiclass_prob\n");
	for(t=0;t<k;t++) free(Q[t]);
	free(Q);
	free(Qp);
}
// Cross-validation decision values for probability estimates
static void svm_binary_svc_probability(
	const svm_problem *prob, const svm_parameter *param,
	double Cp, double Cn, double& probA, double& probB)
{
	int i;
	int nr_fold = 5;
	int *perm = Malloc(int,prob->l);
double *dec_values = Malloc(double,prob->l);
// random shuffle
	for(i=0;i<prob->l;i++) perm[i]=i;
	for(i=0;i<prob->l;i++)
{
int j = i+rand()%(prob->l-i);
swap(perm[i],perm[j]);
}
	for(i=0;i<nr_fold;i++)
	{
		int begin = i*prob->l/nr_fold;
int end = (i+1)*prob->l/nr_fold;
int j,k;
struct svm_problem subprob;
subprob.l = prob->l-(end-begin);
#ifdef _DENSE_REP
subprob.x = Malloc(struct svm_node,subprob.l);
#else
subprob.x = Malloc(struct svm_node*,subprob.l);
#endif
subprob.y = Malloc(double,subprob.l);
k=0;
		for(j=0;j<begin;j++)
		{
			subprob.x[k] = prob->x[perm[j]];
subprob.y[k] = prob->y[perm[j]];
++k;
}
		for(j=end;j<prob->l;j++)
{
subprob.x[k] = prob->x[perm[j]];
subprob.y[k] = prob->y[perm[j]];
++k;
}
int p_count=0,n_count=0;
		for(j=0;j<k;j++)
			if(subprob.y[j]>0)
p_count++;
else
n_count++;
if(p_count==0 && n_count==0)
			for(j=begin;j<end;j++)
				dec_values[perm[j]] = 0;
		else if(p_count > 0 && n_count == 0)
			for(j=begin;j<end;j++)
				dec_values[perm[j]] = 1;
		else if(p_count == 0 && n_count > 0)
			for(j=begin;j<end;j++)
				dec_values[perm[j]] = -1;
		else
		{
			svm_parameter subparam = *param;
			subparam.probability=0;
			subparam.C=1.0;
			subparam.nr_weight=2;
			subparam.weight_label = Malloc(int,2);
			subparam.weight = Malloc(double,2);
			subparam.weight_label[0]=+1;
			subparam.weight_label[1]=-1;
			subparam.weight[0]=Cp;
			subparam.weight[1]=Cn;
			struct svm_model *submodel = svm_train(&subprob,&subparam);
			for(j=begin;j<end;j++)
			{
#ifdef _DENSE_REP
				svm_predict_values(submodel,(prob->x+perm[j]),&(dec_values[perm[j]]));
#else
svm_predict_values(submodel,prob->x[perm[j]],&(dec_values[perm[j]]));
#endif
				// ensure +1 -1 order; this is why the generic CV subroutine is not used here
dec_values[perm[j]] *= submodel->label[0];
}
svm_free_and_destroy_model(&submodel);
svm_destroy_param(&subparam);
}
free(subprob.x);
free(subprob.y);
}
sigmoid_train(prob->l,dec_values,prob->y,probA,probB);
free(dec_values);
free(perm);
}
// Return parameter of a Laplace distribution
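// The scale parameter sigma is estimated as the mean absolute residual of
// 5-fold cross-validation predictions, after discarding residuals larger
// than 5 standard deviations as outliers.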
static double svm_svr_probability(
const svm_problem *prob, const svm_parameter *param)
{
int i;
int nr_fold = 5;
double *ymv = Malloc(double,prob->l);
double mae = 0;
svm_parameter newparam = *param;
newparam.probability = 0;
svm_cross_validation(prob,&newparam,nr_fold,ymv);
	for(i=0;i<prob->l;i++)
{
ymv[i]=prob->y[i]-ymv[i];
mae += fabs(ymv[i]);
}
mae /= prob->l;
double std=sqrt(2*mae*mae);
int count=0;
mae=0;
	for(i=0;i<prob->l;i++)
if (fabs(ymv[i]) > 5*std)
count=count+1;
else
mae+=fabs(ymv[i]);
mae /= (prob->l-count);
info("Prob. model for test data: target value = predicted value + z,\nz: Laplace distribution e^(-|z|/sigma)/(2sigma),sigma= %g\n",mae);
free(ymv);
return mae;
}
// label: label name, start: begin of each class, count: #data of classes, perm: indices to the original data
// perm, length l, must be allocated before calling this subroutine
static void svm_group_classes(const svm_problem *prob, int *nr_class_ret, int **label_ret, int **start_ret, int **count_ret, int *perm)
{
int l = prob->l;
int max_nr_class = 16;
int nr_class = 0;
int *label = Malloc(int,max_nr_class);
int *count = Malloc(int,max_nr_class);
int *data_label = Malloc(int,l);
int i;
	for(i=0;i<l;i++)
	{
		int this_label = (int)prob->y[i];
int j;
		for(j=0;j<nr_class;j++)
		{
			if(this_label == label[j])
			{
				++count[j];
				break;
			}
		}
		data_label[i] = j;
		if(j == nr_class)
		{
			if(nr_class == max_nr_class)
			{
				max_nr_class *= 2;
				label = (int *)realloc(label,max_nr_class*sizeof(int));
				count = (int *)realloc(count,max_nr_class*sizeof(int));
			}
			label[nr_class] = this_label;
			count[nr_class] = 1;
			++nr_class;
		}
	}
	int *start = Malloc(int,nr_class);
	start[0] = 0;
	for(i=1;i<nr_class;i++)
		start[i] = start[i-1]+count[i-1];
	for(i=0;i<l;i++)
	{
		perm[start[data_label[i]]] = i;
		++start[data_label[i]];
	}
	start[0] = 0;
	for(i=1;i<nr_class;i++)
		start[i] = start[i-1]+count[i-1];
	*nr_class_ret = nr_class;
	*label_ret = label;
	*start_ret = start;
	*count_ret = count;
	free(data_label);
}
//
// Interface functions
//
svm_model *svm_train(const svm_problem *prob, const svm_parameter *param)
{
	svm_model *model = Malloc(svm_model,1);
	model->param = *param;
model->free_sv = 0; // XXX
if(param->svm_type == ONE_CLASS ||
param->svm_type == EPSILON_SVR ||
param->svm_type == NU_SVR)
{
// regression or one-class-svm
model->nr_class = 2;
model->label = NULL;
model->nSV = NULL;
model->probA = NULL; model->probB = NULL;
model->sv_coef = Malloc(double *,1);
if(param->probability &&
(param->svm_type == EPSILON_SVR ||
param->svm_type == NU_SVR))
{
model->probA = Malloc(double,1);
model->probA[0] = svm_svr_probability(prob,param);
}
decision_function f = svm_train_one(prob,param,0,0);
model->rho = Malloc(double,1);
model->rho[0] = f.rho;
int nSV = 0;
int i;
		for(i=0;i<prob->l;i++)
if(fabs(f.alpha[i]) > 0) ++nSV;
model->l = nSV;
#ifdef _DENSE_REP
model->SV = Malloc(svm_node,nSV);
#else
model->SV = Malloc(svm_node *,nSV);
#endif
model->sv_coef[0] = Malloc(double,nSV);
model->sv_indices = Malloc(int,nSV);
int j = 0;
		for(i=0;i<prob->l;i++)
if(fabs(f.alpha[i]) > 0)
{
model->SV[j] = prob->x[i];
model->sv_coef[0][j] = f.alpha[i];
model->sv_indices[j] = i+1;
++j;
}
free(f.alpha);
}
else
{
// classification
int l = prob->l;
int nr_class;
int *label = NULL;
int *start = NULL;
int *count = NULL;
int *perm = Malloc(int,l);
// group training data of the same class
svm_group_classes(prob,&nr_class,&label,&start,&count,perm);
#ifdef _DENSE_REP
svm_node *x = Malloc(svm_node,l);
#else
svm_node **x = Malloc(svm_node *,l);
#endif
int i;
		for(i=0;i<l;i++)
			x[i] = prob->x[perm[i]];
// calculate weighted C
double *weighted_C = Malloc(double, nr_class);
		for(i=0;i<nr_class;i++)
			weighted_C[i] = param->C;
		for(i=0;i<param->nr_weight;i++)
{
int j;
			for(j=0;j<nr_class;j++)
				if(param->weight_label[i] == label[j])
break;
if(j == nr_class)
fprintf(stderr,"warning: class label %d specified in weight is not found\n", param->weight_label[i]);
else
weighted_C[j] *= param->weight[i];
}
// train k*(k-1)/2 models
bool *nonzero = Malloc(bool,l);
		for(i=0;i<l;i++)
			nonzero[i] = false;
		decision_function *f = Malloc(decision_function,nr_class*(nr_class-1)/2);
		double *probA=NULL,*probB=NULL;
		if (param->probability)
{
probA=Malloc(double,nr_class*(nr_class-1)/2);
probB=Malloc(double,nr_class*(nr_class-1)/2);
}
int p = 0;
		for(i=0;i<nr_class;i++)
			for(int j=i+1;j<nr_class;j++)
			{
				svm_problem sub_prob;
				int si = start[i], sj = start[j];
				int ci = count[i], cj = count[j];
				sub_prob.l = ci+cj;
#ifdef _DENSE_REP
				sub_prob.x = Malloc(svm_node,sub_prob.l);
#else
				sub_prob.x = Malloc(svm_node *,sub_prob.l);
#endif
				sub_prob.y = Malloc(double,sub_prob.l);
				int k;
				for(k=0;k<ci;k++)
				{
					sub_prob.x[k] = x[si+k];
					sub_prob.y[k] = +1;
				}
				for(k=0;k<cj;k++)
				{
					sub_prob.x[ci+k] = x[sj+k];
					sub_prob.y[ci+k] = -1;
				}
				if(param->probability)
svm_binary_svc_probability(&sub_prob,param,weighted_C[i],weighted_C[j],probA[p],probB[p]);
f[p] = svm_train_one(&sub_prob,param,weighted_C[i],weighted_C[j]);
				for(k=0;k<ci;k++)
					if(!nonzero[si+k] && fabs(f[p].alpha[k]) > 0)
nonzero[si+k] = true;
				for(k=0;k<cj;k++)
					if(!nonzero[sj+k] && fabs(f[p].alpha[ci+k]) > 0)
nonzero[sj+k] = true;
free(sub_prob.x);
free(sub_prob.y);
++p;
}
// build output
model->nr_class = nr_class;
model->label = Malloc(int,nr_class);
		for(i=0;i<nr_class;i++)
			model->label[i] = label[i];
model->rho = Malloc(double,nr_class*(nr_class-1)/2);
		for(i=0;i<nr_class*(nr_class-1)/2;i++)
			model->rho[i] = f[i].rho;
if(param->probability)
{
model->probA = Malloc(double,nr_class*(nr_class-1)/2);
model->probB = Malloc(double,nr_class*(nr_class-1)/2);
			for(i=0;i<nr_class*(nr_class-1)/2;i++)
			{
				model->probA[i] = probA[i];
model->probB[i] = probB[i];
}
}
else
{
model->probA=NULL;
model->probB=NULL;
}
int total_sv = 0;
int *nz_count = Malloc(int,nr_class);
model->nSV = Malloc(int,nr_class);
		for(i=0;i<nr_class;i++)
		{
			int nSV = 0;
			for(int j=0;j<count[i];j++)
				if(nonzero[start[i]+j])
				{
					++nSV;
					++total_sv;
				}
			model->nSV[i] = nSV;
nz_count[i] = nSV;
}
info("Total nSV = %d\n",total_sv);
model->l = total_sv;
#ifdef _DENSE_REP
model->SV = Malloc(svm_node,total_sv);
#else
model->SV = Malloc(svm_node *,total_sv);
#endif
model->sv_indices = Malloc(int,total_sv);
p = 0;
		for(i=0;i<l;i++)
			if(nonzero[i])
			{
				model->SV[p] = x[i];
model->sv_indices[p++] = perm[i] + 1;
}
int *nz_start = Malloc(int,nr_class);
nz_start[0] = 0;
		for(i=1;i<nr_class;i++)
			nz_start[i] = nz_start[i-1]+nz_count[i-1];
		model->sv_coef = Malloc(double *,nr_class-1);
		for(i=0;i<nr_class-1;i++)
			model->sv_coef[i] = Malloc(double,total_sv);
p = 0;
		for(i=0;i<nr_class;i++)
			for(int j=i+1;j<nr_class;j++)
			{
				// classifier (i,j): coefficients with
				// i are in sv_coef[j-1][nz_start[i]...],
				// j are in sv_coef[i][nz_start[j]...]
				int si = start[i];
				int sj = start[j];
				int ci = count[i];
				int cj = count[j];
				int q = nz_start[i];
				int k;
				for(k=0;k<ci;k++)
					if(nonzero[si+k])
						model->sv_coef[j-1][q++] = f[p].alpha[k];
q = nz_start[j];
				for(k=0;k<cj;k++)
					if(nonzero[sj+k])
						model->sv_coef[i][q++] = f[p].alpha[ci+k];
++p;
}
free(label);
free(probA);
free(probB);
free(count);
free(perm);
free(start);
free(x);
free(weighted_C);
free(nonzero);
		for(i=0;i<nr_class*(nr_class-1)/2;i++)
			free(f[i].alpha);
		free(f);
		free(nz_count);
		free(nz_start);
	}
	return model;
}
// Stratified cross validation
void svm_cross_validation(const svm_problem *prob, const svm_parameter *param, int nr_fold, double *target)
{
	int i;
	int *fold_start;
	int l = prob->l;
int *perm = Malloc(int,l);
int nr_class;
if (nr_fold > l)
{
nr_fold = l;
fprintf(stderr,"WARNING: # folds > # data. Will use # folds = # data instead (i.e., leave-one-out cross validation)\n");
}
fold_start = Malloc(int,nr_fold+1);
// stratified cv may not give leave-one-out rate
// Each class to l folds -> some folds may have zero elements
if((param->svm_type == C_SVC ||
param->svm_type == NU_SVC) && nr_fold < l)
{
int *start = NULL;
int *label = NULL;
int *count = NULL;
svm_group_classes(prob,&nr_class,&label,&start,&count,perm);
// random shuffle and then data grouped by fold using the array perm
int *fold_count = Malloc(int,nr_fold);
int c;
int *index = Malloc(int,l);
		for(i=0;i<l;i++)
			index[i]=perm[i];
		for (c=0; c<nr_class; c++)
			for(i=0;i<count[c];i++)
			{
				int j = i+rand()%(count[c]-i);
				swap(index[start[c]+j],index[start[c]+i]);
			}
		for(i=0;i<nr_fold;i++)
		{
			fold_count[i] = 0;
			for (c=0; c<nr_class;c++)
				fold_count[i]+=(i+1)*count[c]/nr_fold-i*count[c]/nr_fold;
		}
		fold_start[0]=0;
		for (i=1;i<=nr_fold;i++)
			fold_start[i] = fold_start[i-1]+fold_count[i-1];
		for (c=0; c<nr_class;c++)
			for(i=0;i<nr_fold;i++)
			{
				int begin = start[c]+i*count[c]/nr_fold;
				int end = start[c]+(i+1)*count[c]/nr_fold;
				for(int j=begin;j<end;j++)
				{
					perm[fold_start[i]] = index[j];
					fold_start[i]++;
				}
			}
		fold_start[0]=0;
		for (i=1;i<=nr_fold;i++)
			fold_start[i] = fold_start[i-1]+fold_count[i-1];
		free(start);
		free(label);
		free(count);
		free(index);
		free(fold_count);
	}
	else
	{
		for(i=0;i<l;i++) perm[i]=i;
		for(i=0;i<l;i++)
		{
			int j = i+rand()%(l-i);
			swap(perm[i],perm[j]);
		}
		for(i=0;i<=nr_fold;i++)
			fold_start[i]=i*l/nr_fold;
	}
	for(i=0;i<nr_fold;i++)
	{
		int begin = fold_start[i];
		int end = fold_start[i+1];
		int j,k;
		struct svm_problem subprob;
		subprob.l = l-(end-begin);
#ifdef _DENSE_REP
		subprob.x = Malloc(struct svm_node,subprob.l);
#else
		subprob.x = Malloc(struct svm_node*,subprob.l);
#endif
		subprob.y = Malloc(double,subprob.l);
		k=0;
		for(j=0;j<begin;j++)
		{
			subprob.x[k] = prob->x[perm[j]];
subprob.y[k] = prob->y[perm[j]];
++k;
}
		for(j=end;j<l;j++)
		{
			subprob.x[k] = prob->x[perm[j]];
subprob.y[k] = prob->y[perm[j]];
++k;
}
struct svm_model *submodel = svm_train(&subprob,param);
if(param->probability &&
(param->svm_type == C_SVC || param->svm_type == NU_SVC))
{
double *prob_estimates=Malloc(double,svm_get_nr_class(submodel));
			for(j=begin;j<end;j++)
#ifdef _DENSE_REP
				target[perm[j]] = svm_predict_probability(submodel,(prob->x + perm[j]),prob_estimates);
#else
target[perm[j]] = svm_predict_probability(submodel,prob->x[perm[j]],prob_estimates);
#endif
free(prob_estimates);
}
else
			for(j=begin;j<end;j++)
#ifdef _DENSE_REP
				target[perm[j]] = svm_predict(submodel,prob->x+perm[j]);
#else
target[perm[j]] = svm_predict(submodel,prob->x[perm[j]]);
#endif
svm_free_and_destroy_model(&submodel);
free(subprob.x);
free(subprob.y);
}
free(fold_start);
free(perm);
}
int svm_get_svm_type(const svm_model *model)
{
return model->param.svm_type;
}
int svm_get_nr_class(const svm_model *model)
{
return model->nr_class;
}
void svm_get_labels(const svm_model *model, int* label)
{
if (model->label != NULL)
		for(int i=0;i<model->nr_class;i++)
label[i] = model->label[i];
}
void svm_get_sv_indices(const svm_model *model, int* indices)
{
if (model->sv_indices != NULL)
		for(int i=0;i<model->l;i++)
indices[i] = model->sv_indices[i];
}
int svm_get_nr_sv(const svm_model *model)
{
return model->l;
}
double svm_get_svr_probability(const svm_model *model)
{
if ((model->param.svm_type == EPSILON_SVR || model->param.svm_type == NU_SVR) &&
model->probA!=NULL)
return model->probA[0];
else
{
fprintf(stderr,"Model doesn't contain information for SVR probability inference\n");
return 0;
}
}
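// For one-class and SVR models a single decision value is computed from the
// support vector expansion. For classification, all nr_class*(nr_class-1)/2
// pairwise decision values are computed; each one casts a vote and the label
// with the most votes is returned.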
double svm_predict_values(const svm_model *model, const svm_node *x, double* dec_values)
{
int i;
if(model->param.svm_type == ONE_CLASS ||
model->param.svm_type == EPSILON_SVR ||
model->param.svm_type == NU_SVR)
{
double *sv_coef = model->sv_coef[0];
double sum = 0;
		for(i=0;i<model->l;i++)
#ifdef _DENSE_REP
sum += sv_coef[i] * Kernel::k_function(x,model->SV+i,model->param);
#else
sum += sv_coef[i] * Kernel::k_function(x,model->SV[i],model->param);
#endif
sum -= model->rho[0];
*dec_values = sum;
if(model->param.svm_type == ONE_CLASS)
return (sum>0)?1:-1;
else
return sum;
}
else
{
int i;
int nr_class = model->nr_class;
int l = model->l;
double *kvalue = Malloc(double,l);
		for(i=0;i<l;i++)
#ifdef _DENSE_REP
			kvalue[i] = Kernel::k_function(x,model->SV+i,model->param);
#else
kvalue[i] = Kernel::k_function(x,model->SV[i],model->param);
#endif
int *start = Malloc(int,nr_class);
start[0] = 0;
		for(i=1;i<nr_class;i++)
			start[i] = start[i-1]+model->nSV[i-1];
int *vote = Malloc(int,nr_class);
		for(i=0;i<nr_class;i++)
			vote[i] = 0;
		int p=0;
		for(i=0;i<nr_class;i++)
			for(int j=i+1;j<nr_class;j++)
			{
				double sum = 0;
				int si = start[i];
				int sj = start[j];
				int ci = model->nSV[i];
int cj = model->nSV[j];
int k;
double *coef1 = model->sv_coef[j-1];
double *coef2 = model->sv_coef[i];
				for(k=0;k<ci;k++)
					sum += coef1[si+k] * kvalue[si+k];
				for(k=0;k<cj;k++)
					sum += coef2[sj+k] * kvalue[sj+k];
				sum -= model->rho[p];
dec_values[p] = sum;
if(dec_values[p] > 0)
++vote[i];
else
++vote[j];
p++;
}
int vote_max_idx = 0;
		for(i=1;i<nr_class;i++)
			if(vote[i] > vote[vote_max_idx])
vote_max_idx = i;
free(kvalue);
free(start);
free(vote);
return model->label[vote_max_idx];
}
}
double svm_predict(const svm_model *model, const svm_node *x)
{
int nr_class = model->nr_class;
double *dec_values;
if(model->param.svm_type == ONE_CLASS ||
model->param.svm_type == EPSILON_SVR ||
model->param.svm_type == NU_SVR)
dec_values = Malloc(double, 1);
else
dec_values = Malloc(double, nr_class*(nr_class-1)/2);
double pred_result = svm_predict_values(model, x, dec_values);
free(dec_values);
return pred_result;
}
double svm_predict_probability(
const svm_model *model, const svm_node *x, double *prob_estimates)
{
if ((model->param.svm_type == C_SVC || model->param.svm_type == NU_SVC) &&
model->probA!=NULL && model->probB!=NULL)
{
int i;
int nr_class = model->nr_class;
double *dec_values = Malloc(double, nr_class*(nr_class-1)/2);
svm_predict_values(model, x, dec_values);
double min_prob=1e-7;
double **pairwise_prob=Malloc(double *,nr_class);
		for(i=0;i<nr_class;i++)
			pairwise_prob[i]=Malloc(double,nr_class);
		int k=0;
		for(i=0;i<nr_class;i++)
			for(int j=i+1;j<nr_class;j++)
			{
				pairwise_prob[i][j]=min(max(sigmoid_predict(dec_values[k],model->probA[k],model->probB[k]),min_prob),1-min_prob);
pairwise_prob[j][i]=1-pairwise_prob[i][j];
k++;
}
multiclass_probability(nr_class,pairwise_prob,prob_estimates);
int prob_max_idx = 0;
		for(i=1;i<nr_class;i++)
			if(prob_estimates[i] > prob_estimates[prob_max_idx])
prob_max_idx = i;
		for(i=0;i<nr_class;i++)
			free(pairwise_prob[i]);
		free(dec_values);
		free(pairwise_prob);
		return model->label[prob_max_idx];
}
else
return svm_predict(model, x);
}
static const char *svm_type_table[] =
{
"c_svc","nu_svc","one_class","epsilon_svr","nu_svr",NULL
};
static const char *kernel_type_table[]=
{
"linear","polynomial","rbf","sigmoid","precomputed",NULL
};
int svm_save_model(const char *model_file_name, const svm_model *model)
{
FILE *fp = fopen(model_file_name,"w");
if(fp==NULL) return -1;
char *old_locale = strdup(setlocale(LC_ALL, NULL));
setlocale(LC_ALL, "C");
const svm_parameter& param = model->param;
fprintf(fp,"svm_type %s\n", svm_type_table[param.svm_type]);
fprintf(fp,"kernel_type %s\n", kernel_type_table[param.kernel_type]);
if(param.kernel_type == POLY)
fprintf(fp,"degree %d\n", param.degree);
if(param.kernel_type == POLY || param.kernel_type == RBF || param.kernel_type == SIGMOID)
fprintf(fp,"gamma %g\n", param.gamma);
if(param.kernel_type == POLY || param.kernel_type == SIGMOID)
fprintf(fp,"coef0 %g\n", param.coef0);
int nr_class = model->nr_class;
int l = model->l;
fprintf(fp, "nr_class %d\n", nr_class);
fprintf(fp, "total_sv %d\n",l);
{
fprintf(fp, "rho");
		for(int i=0;i<nr_class*(nr_class-1)/2;i++)
			fprintf(fp," %g",model->rho[i]);
fprintf(fp, "\n");
}
if(model->label)
{
fprintf(fp, "label");
		for(int i=0;i<nr_class;i++)
			fprintf(fp," %d",model->label[i]);
fprintf(fp, "\n");
}
if(model->probA) // regression has probA only
{
fprintf(fp, "probA");
		for(int i=0;i<nr_class*(nr_class-1)/2;i++)
			fprintf(fp," %g",model->probA[i]);
fprintf(fp, "\n");
}
if(model->probB)
{
fprintf(fp, "probB");
		for(int i=0;i<nr_class*(nr_class-1)/2;i++)
			fprintf(fp," %g",model->probB[i]);
fprintf(fp, "\n");
}
if(model->nSV)
{
fprintf(fp, "nr_sv");
		for(int i=0;i<nr_class;i++)
			fprintf(fp," %d",model->nSV[i]);
fprintf(fp, "\n");
}
fprintf(fp, "SV\n");
const double * const *sv_coef = model->sv_coef;
#ifdef _DENSE_REP
const svm_node *SV = model->SV;
#else
const svm_node * const *SV = model->SV;
#endif
	for(int i=0;i<l;i++)
	{
		for(int j=0;j<nr_class-1;j++)
			fprintf(fp, "%.16g ",sv_coef[j][i]);
#ifdef _DENSE_REP
		const svm_node *p = (SV + i);
		if(param.kernel_type == PRECOMPUTED)
			fprintf(fp,"0:%d ",(int)(p->values[0]));
else
for (int j = 0; j < p->dim; j++)
if (p->values[j] != 0.0)
fprintf(fp,"%d:%.8g ",j, p->values[j]);
#else
const svm_node *p = SV[i];
if(param.kernel_type == PRECOMPUTED)
fprintf(fp,"0:%d ",(int)(p->value));
else
while(p->index != -1)
{
fprintf(fp,"%d:%.8g ",p->index,p->value);
p++;
}
#endif
fprintf(fp, "\n");
}
setlocale(LC_ALL, old_locale);
free(old_locale);
if (ferror(fp) != 0 || fclose(fp) != 0) return -1;
else return 0;
}
static char *line = NULL;
static int max_line_len;
static char* readline(FILE *input)
{
int len;
if(fgets(line,max_line_len,input) == NULL)
return NULL;
while(strrchr(line,'\n') == NULL)
{
max_line_len *= 2;
line = (char *) realloc(line,max_line_len);
len = (int) strlen(line);
if(fgets(line+len,max_line_len-len,input) == NULL)
break;
}
return line;
}
svm_model *svm_load_model(const char *model_file_name)
{
FILE *fp = fopen(model_file_name,"rb");
if(fp==NULL) return NULL;
char *old_locale = strdup(setlocale(LC_ALL, NULL));
setlocale(LC_ALL, "C");
// read parameters
svm_model *model = Malloc(svm_model,1);
svm_parameter& param = model->param;
model->rho = NULL;
model->probA = NULL;
model->probB = NULL;
model->sv_indices = NULL;
model->label = NULL;
model->nSV = NULL;
char cmd[81];
while(1)
{
fscanf(fp,"%80s",cmd);
if(strcmp(cmd,"svm_type")==0)
{
fscanf(fp,"%80s",cmd);
int i;
for(i=0;svm_type_table[i];i++)
{
if(strcmp(svm_type_table[i],cmd)==0)
{
param.svm_type=i;
break;
}
}
if(svm_type_table[i] == NULL)
{
fprintf(stderr,"unknown svm type.\n");
setlocale(LC_ALL, old_locale);
free(old_locale);
free(model->rho);
free(model->label);
free(model->nSV);
free(model);
return NULL;
}
}
else if(strcmp(cmd,"kernel_type")==0)
{
fscanf(fp,"%80s",cmd);
int i;
for(i=0;kernel_type_table[i];i++)
{
if(strcmp(kernel_type_table[i],cmd)==0)
{
param.kernel_type=i;
break;
}
}
if(kernel_type_table[i] == NULL)
{
fprintf(stderr,"unknown kernel function.\n");
setlocale(LC_ALL, old_locale);
free(old_locale);
free(model->rho);
free(model->label);
free(model->nSV);
free(model);
return NULL;
}
}
else if(strcmp(cmd,"degree")==0)
fscanf(fp,"%d",¶m.degree);
else if(strcmp(cmd,"gamma")==0)
fscanf(fp,"%lf",¶m.gamma);
else if(strcmp(cmd,"coef0")==0)
fscanf(fp,"%lf",¶m.coef0);
else if(strcmp(cmd,"nr_class")==0)
fscanf(fp,"%d",&model->nr_class);
else if(strcmp(cmd,"total_sv")==0)
fscanf(fp,"%d",&model->l);
else if(strcmp(cmd,"rho")==0)
{
int n = model->nr_class * (model->nr_class-1)/2;
model->rho = Malloc(double,n);
			for(int i=0;i<n;i++)
				fscanf(fp,"%lf",&model->rho[i]);
}
else if(strcmp(cmd,"label")==0)
{
int n = model->nr_class;
model->label = Malloc(int,n);
			for(int i=0;i<n;i++)
				fscanf(fp,"%d",&model->label[i]);
}
else if(strcmp(cmd,"probA")==0)
{
int n = model->nr_class * (model->nr_class-1)/2;
model->probA = Malloc(double,n);
			for(int i=0;i<n;i++)
				fscanf(fp,"%lf",&model->probA[i]);
}
else if(strcmp(cmd,"probB")==0)
{
int n = model->nr_class * (model->nr_class-1)/2;
model->probB = Malloc(double,n);
			for(int i=0;i<n;i++)
				fscanf(fp,"%lf",&model->probB[i]);
}
else if(strcmp(cmd,"nr_sv")==0)
{
int n = model->nr_class;
model->nSV = Malloc(int,n);
			for(int i=0;i<n;i++)
				fscanf(fp,"%d",&model->nSV[i]);
}
else if(strcmp(cmd,"SV")==0)
{
while(1)
{
int c = getc(fp);
if(c==EOF || c=='\n') break;
}
break;
}
else
{
fprintf(stderr,"unknown text in model file: [%s]\n",cmd);
setlocale(LC_ALL, old_locale);
free(old_locale);
free(model->rho);
free(model->label);
free(model->nSV);
free(model);
return NULL;
}
}
// read sv_coef and SV
int elements = 0;
long pos = ftell(fp);
max_line_len = 1024;
line = Malloc(char,max_line_len);
char *p,*endptr,*idx,*val;
#ifdef _DENSE_REP
int max_index = 1;
// read the max dimension of all vectors
while(readline(fp) != NULL)
{
char *p;
p = strrchr(line, ':');
if(p != NULL)
{
while(*p != ' ' && *p != '\t' && p > line)
p--;
if(p > line)
max_index = (int) strtol(p,&endptr,10) + 1;
}
if(max_index > elements)
elements = max_index;
}
#else
while(readline(fp)!=NULL)
{
p = strtok(line,":");
while(1)
{
p = strtok(NULL,":");
if(p == NULL)
break;
++elements;
}
}
elements += model->l;
#endif
fseek(fp,pos,SEEK_SET);
int m = model->nr_class - 1;
int l = model->l;
model->sv_coef = Malloc(double *,m);
int i;
	for(i=0;i<m;i++)
		model->sv_coef[i] = Malloc(double,l);
#ifdef _DENSE_REP
int index;
model->SV = Malloc(svm_node,l);
	for(i=0;i<l;i++)
	{
		readline(fp);
		model->SV[i].values = Malloc(double, elements);
model->SV[i].dim = 0;
p = strtok(line, " \t");
model->sv_coef[0][i] = strtod(p,&endptr);
		for(int k=1;k<m;k++)
		{
			p = strtok(NULL, " \t");
			model->sv_coef[k][i] = strtod(p,&endptr);
}
int *d = &(model->SV[i].dim);
while(1)
{
idx = strtok(NULL, ":");
val = strtok(NULL, " \t");
if(val == NULL)
break;
index = (int) strtol(idx,&endptr,10);
while (*d < index)
model->SV[i].values[(*d)++] = 0.0;
model->SV[i].values[(*d)++] = strtod(val,&endptr);
}
}
#else
model->SV = Malloc(svm_node*,l);
svm_node *x_space = NULL;
if(l>0) x_space = Malloc(svm_node,elements);
int j=0;
	for(i=0;i<l;i++)
	{
		readline(fp);
		model->SV[i] = &x_space[j];
p = strtok(line, " \t");
model->sv_coef[0][i] = strtod(p,&endptr);
		for(int k=1;k<m;k++)
		{
			p = strtok(NULL, " \t");
			model->sv_coef[k][i] = strtod(p,&endptr);
}
while(1)
{
idx = strtok(NULL, ":");
val = strtok(NULL, " \t");
if(val == NULL)
break;
x_space[j].index = (int) strtol(idx,&endptr,10);
x_space[j].value = strtod(val,&endptr);
++j;
}
x_space[j++].index = -1;
}
#endif
free(line);
setlocale(LC_ALL, old_locale);
free(old_locale);
if (ferror(fp) != 0 || fclose(fp) != 0)
return NULL;
model->free_sv = 1; // XXX
return model;
}
void svm_free_model_content(svm_model* model_ptr)
{
if(model_ptr->free_sv && model_ptr->l > 0 && model_ptr->SV != NULL)
#ifdef _DENSE_REP
for (int i = 0; i < model_ptr->l; i++)
free (model_ptr->SV[i].values);
#else
free((void *)(model_ptr->SV[0]));
#endif
if(model_ptr->sv_coef)
{
		for(int i=0;i<model_ptr->nr_class-1;i++)
free(model_ptr->sv_coef[i]);
}
free(model_ptr->SV);
model_ptr->SV = NULL;
free(model_ptr->sv_coef);
model_ptr->sv_coef = NULL;
free(model_ptr->rho);
model_ptr->rho = NULL;
free(model_ptr->label);
model_ptr->label= NULL;
free(model_ptr->probA);
model_ptr->probA = NULL;
free(model_ptr->probB);
model_ptr->probB= NULL;
free(model_ptr->sv_indices);
model_ptr->sv_indices = NULL;
free(model_ptr->nSV);
model_ptr->nSV = NULL;
}
void svm_free_and_destroy_model(svm_model** model_ptr_ptr)
{
if(model_ptr_ptr != NULL && *model_ptr_ptr != NULL)
{
svm_free_model_content(*model_ptr_ptr);
free(*model_ptr_ptr);
*model_ptr_ptr = NULL;
}
}
void svm_destroy_param(svm_parameter* param)
{
free(param->weight_label);
free(param->weight);
}
const char *svm_check_parameter(const svm_problem *prob, const svm_parameter *param)
{
// svm_type
int svm_type = param->svm_type;
if(svm_type != C_SVC &&
svm_type != NU_SVC &&
svm_type != ONE_CLASS &&
svm_type != EPSILON_SVR &&
svm_type != NU_SVR)
return "unknown svm type";
// kernel_type, degree
int kernel_type = param->kernel_type;
if(kernel_type != LINEAR &&
kernel_type != POLY &&
kernel_type != RBF &&
kernel_type != SIGMOID &&
kernel_type != PRECOMPUTED)
return "unknown kernel type";
if(param->gamma < 0)
return "gamma < 0";
if(param->degree < 0)
return "degree of polynomial kernel < 0";
// cache_size,eps,C,nu,p,shrinking
if(param->cache_size <= 0)
return "cache_size <= 0";
if(param->eps <= 0)
return "eps <= 0";
if(svm_type == C_SVC ||
svm_type == EPSILON_SVR ||
svm_type == NU_SVR)
if(param->C <= 0)
return "C <= 0";
if(svm_type == NU_SVC ||
svm_type == ONE_CLASS ||
svm_type == NU_SVR)
if(param->nu <= 0 || param->nu > 1)
return "nu <= 0 or nu > 1";
if(svm_type == EPSILON_SVR)
if(param->p < 0)
return "p < 0";
if(param->shrinking != 0 &&
param->shrinking != 1)
return "shrinking != 0 and shrinking != 1";
if(param->probability != 0 &&
param->probability != 1)
return "probability != 0 and probability != 1";
if(param->probability == 1 &&
svm_type == ONE_CLASS)
return "one-class SVM probability output not supported yet";
// check whether nu-svc is feasible
if(svm_type == NU_SVC)
{
int l = prob->l;
int max_nr_class = 16;
int nr_class = 0;
int *label = Malloc(int,max_nr_class);
int *count = Malloc(int,max_nr_class);
int i;
		for(i=0;i<l;i++)
		{
			int this_label = (int)prob->y[i];
int j;
			for(j=0;j<nr_class;j++)
				if(this_label == label[j])
				{
					++count[j];
					break;
				}
			if(j == nr_class)
			{
				if(nr_class == max_nr_class)
				{
					max_nr_class *= 2;
					label = (int *)realloc(label,max_nr_class*sizeof(int));
					count = (int *)realloc(count,max_nr_class*sizeof(int));
				}
				label[nr_class] = this_label;
				count[nr_class] = 1;
				++nr_class;
			}
		}
		for(i=0;i<nr_class;i++)
		{
			int n1 = count[i];
			for(int j=i+1;j<nr_class;j++)
			{
				int n2 = count[j];
				if(param->nu*(n1+n2)/2 > min(n1,n2))
{
free(label);
free(count);
return "specified nu is infeasible";
}
}
}
free(label);
free(count);
}
return NULL;
}
int svm_check_probability_model(const svm_model *model)
{
return ((model->param.svm_type == C_SVC || model->param.svm_type == NU_SVC) &&
model->probA!=NULL && model->probB!=NULL) ||
((model->param.svm_type == EPSILON_SVR || model->param.svm_type == NU_SVR) &&
model->probA!=NULL);
}
void svm_set_print_string_function(void (*print_func)(const char *))
{
if(print_func == NULL)
svm_print_string = &print_string_stdout;
else
svm_print_string = print_func;
}
================================================
FILE: src/linux/svm.h
================================================
#ifndef _LIBSVM_H
#define _LIBSVM_H
#define _DENSE_REP
#define LIBSVM_VERSION 317
#ifdef __cplusplus
extern "C" {
#endif
extern int libsvm_version;
#ifdef _DENSE_REP
struct svm_node
{
int dim;
double *values;
};
struct svm_problem
{
int l;
double *y;
struct svm_node *x;
};
#else
struct svm_node
{
int index;
double value;
};
struct svm_problem
{
int l;
double *y;
struct svm_node **x;
};
#endif
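/*
 * With _DENSE_REP an instance is a dense vector: values[0..dim-1], with
 * absent features stored as 0.0. Without it, an instance is a sparse list
 * of (index,value) pairs terminated by a node with index = -1.
 */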
enum { C_SVC, NU_SVC, ONE_CLASS, EPSILON_SVR, NU_SVR }; /* svm_type */
enum { LINEAR, POLY, RBF, SIGMOID, PRECOMPUTED }; /* kernel_type */
struct svm_parameter
{
int svm_type;
int kernel_type;
int degree; /* for poly */
double gamma; /* for poly/rbf/sigmoid */
double coef0; /* for poly/sigmoid */
/* these are for training only */
double cache_size; /* in MB */
double eps; /* stopping criteria */
double C; /* for C_SVC, EPSILON_SVR and NU_SVR */
int nr_weight; /* for C_SVC */
int *weight_label; /* for C_SVC */
double* weight; /* for C_SVC */
double nu; /* for NU_SVC, ONE_CLASS, and NU_SVR */
double p; /* for EPSILON_SVR */
int shrinking; /* use the shrinking heuristics */
int probability; /* do probability estimates */
};
//
// svm_model
//
struct svm_model
{
struct svm_parameter param; /* parameter */
int nr_class; /* number of classes, = 2 in regression/one class svm */
int l; /* total #SV */
#ifdef _DENSE_REP
struct svm_node *SV; /* SVs (SV[l]) */
#else
struct svm_node **SV; /* SVs (SV[l]) */
#endif
double **sv_coef; /* coefficients for SVs in decision functions (sv_coef[k-1][l]) */
double *rho; /* constants in decision functions (rho[k*(k-1)/2]) */
	double *probA;		/* pairwise probability information */
double *probB;
	int *sv_indices;	/* sv_indices[0,...,nSV-1] are values in [1,...,num_training_data] to indicate SVs in the training set */
/* for classification only */
int *label; /* label of each class (label[k]) */
int *nSV; /* number of SVs for each class (nSV[k]) */
/* nSV[0] + nSV[1] + ... + nSV[k-1] = l */
/* XXX */
int free_sv; /* 1 if svm_model is created by svm_load_model*/
/* 0 if svm_model is created by svm_train */
};
struct svm_model *svm_train(const struct svm_problem *prob, const struct svm_parameter *param);
void svm_cross_validation(const struct svm_problem *prob, const struct svm_parameter *param, int nr_fold, double *target);
int svm_save_model(const char *model_file_name, const struct svm_model *model);
struct svm_model *svm_load_model(const char *model_file_name);
int svm_get_svm_type(const struct svm_model *model);
int svm_get_nr_class(const struct svm_model *model);
void svm_get_labels(const struct svm_model *model, int *label);
void svm_get_sv_indices(const struct svm_model *model, int *sv_indices);
int svm_get_nr_sv(const struct svm_model *model);
double svm_get_svr_probability(const struct svm_model *model);
double svm_predict_values(const struct svm_model *model, const struct svm_node *x, double* dec_values);
double svm_predict(const struct svm_model *model, const struct svm_node *x);
double svm_predict_probability(const struct svm_model *model, const struct svm_node *x, double* prob_estimates);
void svm_free_model_content(struct svm_model *model_ptr);
void svm_free_and_destroy_model(struct svm_model **model_ptr_ptr);
void svm_destroy_param(struct svm_parameter *param);
const char *svm_check_parameter(const struct svm_problem *prob, const struct svm_parameter *param);
int svm_check_probability_model(const struct svm_model *model);
void svm_set_print_string_function(void (*print_func)(const char *));
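/*
 * Typical call sequence (a minimal sketch; error handling and problem setup
 * omitted; with _DENSE_REP, prob.x + i is the i-th instance):
 *
 *   const char *err = svm_check_parameter(&prob, &param);
 *   if (err == NULL) {
 *       struct svm_model *model = svm_train(&prob, &param);
 *       double predicted_label = svm_predict(model, prob.x + 0);
 *       svm_free_and_destroy_model(&model);
 *   }
 */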
#ifdef __cplusplus
}
#endif
#endif /* _LIBSVM_H */
================================================
FILE: src/windows/README-GPU
================================================
GPU-Accelerated LIBSVM exploits the GPU, through the CUDA interface, to
speed up the training process. This package contains a new executable for
training classifiers, "svm-train-gpu.exe", alongside the original one.
The new executable is used exactly like the original one.
It was built with the CUBLAS API version 2, which is compatible with SDK versions 4.0 and later.
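For example, to train a C-SVC classifier with the RBF kernel on the GPU
(the options are the usual LIBSVM ones; "heart_scale" below is the sample
data set shipped with LIBSVM):

    svm-train-gpu.exe -c 2 -g 0.5 heart_scale heart_scale.model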
FEATURES
Mode Supported
* c-svc classification with RBF kernel
Functionality / User interface
* Same as LIBSVM
PREREQUISITES
* NVIDIA Graphics card with CUDA support
* Latest NVIDIA drivers for GPU
* CUDA toolkit & GPU Computing SDK 5.5
Download all in one package from:
https://developer.nvidia.com/cuda-downloads
INSTRUCTIONS
1. Install the NVIDIA drivers, CUDA toolkit and GPU Computing SDK code samples. You can find them all in one package here:
https://developer.nvidia.com/cuda-downloads (Version 5.5)
2. Open the Visual Studio 2010 project file located inside this folder and build.
Additional Information
======================
If you find GPU-Accelerated LIBSVM helpful, please cite it as
A. Athanasopoulos, A. Dimou, V. Mezaris, I. Kompatsiaris, "GPU Acceleration for Support Vector Machines",
Proc. 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2011), Delft, The Netherlands, April 2011.
Software available at http://mklab.iti.gr/project/GPU-LIBSVM
================================================
FILE: src/windows/libsvm_train_dense_gpu/cross_validation_with_matrix_precomputation.c
================================================
void setup_pkm(struct svm_problem *p_km)
{
int i;
p_km->l = prob.l;
p_km->x = Malloc(struct svm_node,p_km->l);
p_km->y = Malloc(double,p_km->l);
	for(i=0;i<p_km->l;i++)
	{
		(p_km->x+i)->values = Malloc(double,prob.l+1);
(p_km->x+i)->dim = prob.l+1;
}
	for( i=0; i<prob.l; i++ )
		p_km->y[i] = prob.y[i];
}
void free_pkm(struct svm_problem *p_km)
{
int i;
	for(i=0;i<p_km->l;i++)
		free((p_km->x+i)->values);
free( p_km->x );
free( p_km->y );
}
double do_crossvalidation(struct svm_problem * p_km)
{
	double rate = 0;
int i;
int total_correct = 0;
double total_error = 0;
double sumv = 0, sumy = 0, sumvv = 0, sumyy = 0, sumvy = 0;
double *target = Malloc(double,prob.l);
	svm_cross_validation(p_km,&param,nr_fold,target);
if(param.svm_type == EPSILON_SVR ||
param.svm_type == NU_SVR)
{
		for(i=0;i<prob.l;i++)
		{
			double y = prob.y[i];
			double v = target[i];
			total_error += (v-y)*(v-y);
			sumv += v;
			sumy += y;
			sumvv += v*v;
			sumyy += y*y;
			sumvy += v*y;
		}
		printf("Cross Validation Mean squared error = %g\n",total_error/prob.l);
		printf("Cross Validation Squared correlation coefficient = %g\n",
			((prob.l*sumvy-sumv*sumy)*(prob.l*sumvy-sumv*sumy))/
			((prob.l*sumvv-sumv*sumv)*(prob.l*sumyy-sumy*sumy))
			);
	}
	else
	{
		for(i=0;i<prob.l;i++)
			if(target[i] == prob.y[i])
				++total_correct;
		rate = 100.0*total_correct/prob.l;
		printf("Cross Validation Accuracy = %g%%\n",rate);
	}
	free(target);
	return rate;
}
void do_cross_validation_with_KM_precalculated()
{
	struct svm_problem p_km;
	setup_pkm(&p_km);
	cal_km(&p_km);
	do_crossvalidation(&p_km);
	free_pkm(&p_km);
}
================================================
FILE: src/windows/libsvm_train_dense_gpu/kernel_matrix_calculation.c
================================================
#include <cuda_runtime.h>
#include "cublas_v2.h"
// Scalars
const float alpha = 1;
const float beta = 0;
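// ckm computes the full RBF kernel matrix on the GPU: the training vectors
// are copied to device memory in column-major order, each row of dot
// products is obtained with cublasSgemv, and the CPU then assembles
// K(x_i,x_j) = exp(-gamma*(|x_i|^2 + |x_j|^2 - 2*x_i.x_j)). The result is
// stored in pecm in LIBSVM precomputed-kernel format, where values[0] of
// each row holds the sample serial number.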
void ckm( struct svm_problem *prob, struct svm_problem *pecm, float *gamma )
{
cublasStatus_t status;
double g_val = *gamma;
long int nfa;
int len_tv;
int ntv;
int i_v;
int i_el;
int i_r, i_c;
int trvei;
double *tv_sq;
double *v_f_g;
float *tr_ar;
float *tva, *vtm, *DP;
float *g_tva = 0, *g_vtm = 0, *g_DotProd = 0;
cudaError_t cudaStat;
cublasHandle_t handle;
status = cublasCreate(&handle);
len_tv = prob-> x[0].dim;
ntv = prob-> l;
nfa = len_tv * ntv;
tva = (float*) malloc ( len_tv * ntv* sizeof(float) );
vtm = (float*) malloc ( len_tv * sizeof(float) );
DP = (float*) malloc ( ntv * sizeof(float) );
tr_ar = (float*) malloc ( len_tv * ntv* sizeof(float) );
tv_sq = (double*) malloc ( ntv * sizeof(double) );
v_f_g = (double*) malloc ( ntv * sizeof(double) );
for ( i_r = 0; i_r < ntv ; i_r++ )
{
for ( i_c = 0; i_c < len_tv; i_c++ )
tva[i_r * len_tv + i_c] = (float)prob-> x[i_r].values[i_c];
}
cudaStat = cudaMalloc((void**)&g_tva, len_tv * ntv * sizeof(float));
if (cudaStat != cudaSuccess) {
free( tva );
free( vtm );
free( DP );
free( v_f_g );
free( tv_sq );
cudaFree( g_tva );
cublasDestroy( handle );
fprintf (stderr, "!!!! Device memory allocation error (A)\n");
getchar();
return;
}
cudaStat = cudaMalloc((void**)&g_vtm, len_tv * sizeof(float));
cudaStat = cudaMalloc((void**)&g_DotProd, ntv * sizeof(float));
for( i_r = 0; i_r < ntv; i_r++ )
for( i_c = 0; i_c < len_tv; i_c++ )
tr_ar[i_c * ntv + i_r] = tva[i_r * len_tv + i_c];
// Copy cpu vector to gpu vector
status = cublasSetVector( len_tv * ntv, sizeof(float), tr_ar, 1, g_tva, 1 );
free( tr_ar );
for( i_v = 0; i_v < ntv; i_v++ )
{
tv_sq[ i_v ] = 0;
for( i_el = 0; i_el < len_tv; i_el++ )
tv_sq[i_v] += pow( tva[i_v*len_tv + i_el], (float)2.0 );
}
for ( trvei = 0; trvei < ntv; trvei++ )
{
status = cublasSetVector( len_tv, sizeof(float), &tva[trvei * len_tv], 1, g_vtm, 1 );
status = cublasSgemv( handle, CUBLAS_OP_N, ntv, len_tv, &alpha, g_tva, ntv , g_vtm, 1, &beta, g_DotProd, 1 );
status = cublasGetVector( ntv, sizeof(float), g_DotProd, 1, DP, 1 );
for ( i_c = 0; i_c < ntv; i_c++ )
v_f_g[i_c] = exp( -g_val * (tv_sq[trvei] + tv_sq[i_c]-((double)2.0)* (double)DP[i_c] ));
pecm-> x[trvei].values[0] = trvei + 1;
for ( i_c = 0; i_c < ntv; i_c++ )
pecm-> x[trvei].values[i_c + 1] = v_f_g[i_c];
}
free( tva );
free( vtm );
free( DP );
free( v_f_g );
free( tv_sq );
cudaFree( g_tva );
cudaFree( g_vtm );
cudaFree( g_DotProd );
cublasDestroy( handle );
}
void cal_km( struct svm_problem * p_km)
{
float gamma = param.gamma;
ckm(&prob, p_km, &gamma);
}
================================================
FILE: src/windows/libsvm_train_dense_gpu/libsvm_train_dense_gpu.vcxproj
================================================
[Visual Studio 2010 C++ project file; the XML markup was lost in extraction. Recoverable settings: project GUID {80730853-7F34-44C7-BBF7-496977A4C14F}, root namespace libsvm_289_dense; Application/Unicode builds for Debug and Release on Win32 and x64; preprocessor definitions include WIN32, _CONSOLE, _CRT_SECURE_NO_DEPRECATE and _DENSE_REP; additional include directories $(CUDA_INC_PATH) and the NVIDIA CUDA Samples / GPU Computing SDK common\inc folders; linker inputs cublas.lib, cudart.lib and cuda.lib from $(CUDA_WIN32_LIBS), $(CUDA_X64_LIBS) or $(CUDA_LIB_PATH); console subsystem.]
================================================
FILE: src/windows/libsvm_train_dense_gpu/libsvm_train_dense_gpu.vcxproj.filters
================================================
[Visual Studio project filters file; the XML markup was lost in extraction. It defines the standard filter groups: Source Files (cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx), Header Files (h;hpp;hxx;hm;inl;inc;xsd) and Resource Files (rc;ico;cur;bmp;...), with the project's source files under Source Files and its header under Header Files.]
================================================
FILE: src/windows/libsvm_train_dense_gpu/libsvm_train_dense_gpu.vcxproj.user
================================================
================================================
FILE: src/windows/libsvm_train_dense_gpu/svm-train.c
================================================
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <errno.h>
#include "svm.h"
#define Malloc(type,n) (type *)malloc((n)*sizeof(type))
void print_null(const char *s) {}
void exit_with_help()
{
printf(
"Usage: svm-train [options] training_set_file [model_file]\n"
"options:\n"
"-s svm_type : set type of SVM (default 0)\n"
" 0 -- C-SVC (multi-class classification)\n"
" 1 -- nu-SVC (multi-class classification)\n"
" 2 -- one-class SVM\n"
" 3 -- epsilon-SVR (regression)\n"
" 4 -- nu-SVR (regression)\n"
"-t kernel_type : set type of kernel function (default 2)\n"
" 0 -- linear: u'*v\n"
" 1 -- polynomial: (gamma*u'*v + coef0)^degree\n"
" 2 -- radial basis function: exp(-gamma*|u-v|^2)\n"
" 3 -- sigmoid: tanh(gamma*u'*v + coef0)\n"
" 4 -- precomputed kernel (kernel values in training_set_file)\n"
"-d degree : set degree in kernel function (default 3)\n"
"-g gamma : set gamma in kernel function (default 1/num_features)\n"
"-r coef0 : set coef0 in kernel function (default 0)\n"
"-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)\n"
"-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)\n"
"-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)\n"
"-m cachesize : set cache memory size in MB (default 100)\n"
"-e epsilon : set tolerance of termination criterion (default 0.001)\n"
"-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)\n"
"-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)\n"
"-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)\n"
"-v n: n-fold cross validation mode\n"
"-q : quiet mode (no outputs)\n"
);
exit(1);
}
void exit_input_error(int line_num)
{
fprintf(stderr,"Wrong input format at line %d\n", line_num);
exit(1);
}
void parse_command_line(int argc, char **argv, char *input_file_name, char *model_file_name);
void read_problem(const char *filename);
void do_cross_validation();
struct svm_parameter param; // set by parse_command_line
struct svm_problem prob; // set by read_problem
struct svm_model *model;
struct svm_node *x_space;
int cross_validation;
int nr_fold;
static char *line = NULL;
static int max_line_len;
#include "kernel_matrix_calculation.c"
#include "cross_validation_with_matrix_precomputation.c"
static char* readline(FILE *input)
{
int len;
if(fgets(line,max_line_len,input) == NULL)
return NULL;
while(strrchr(line,'\n') == NULL)
{
max_line_len *= 2;
line = (char *) realloc(line,max_line_len);
len = (int) strlen(line);
if(fgets(line+len,max_line_len-len,input) == NULL)
break;
}
return line;
}
int main(int argc, char **argv)
{
int i;
char input_file_name[1024];
char model_file_name[1024];
const char *error_msg;
parse_command_line(argc, argv, input_file_name, model_file_name);
read_problem(input_file_name);
	error_msg = svm_check_parameter(&prob,&param);
if(error_msg)
{
fprintf(stderr,"ERROR: %s\n",error_msg);
exit(1);
}
if(cross_validation)
{
do_cross_validation_with_KM_precalculated( );
// do_cross_validation();
}
else
{
		model = svm_train(&prob,&param);
if(svm_save_model(model_file_name,model))
{
fprintf(stderr, "can't save model to file %s\n", model_file_name);
exit(1);
}
svm_free_and_destroy_model(&model);
}
	svm_destroy_param(&param);
free(prob.y);
#ifdef _DENSE_REP
for (i = 0; i < prob.l; ++i)
free((prob.x+i)->values);
#else
free(x_space);
#endif
free(prob.x);
free(line);
return 0;
}
void do_cross_validation()
{
int i;
int total_correct = 0;
double total_error = 0;
double sumv = 0, sumy = 0, sumvv = 0, sumyy = 0, sumvy = 0;
double *target = Malloc(double,prob.l);
	svm_cross_validation(&prob,&param,nr_fold,target);
if(param.svm_type == EPSILON_SVR ||
param.svm_type == NU_SVR)
{
		for(i=0;i<prob.l;i++)
		{
			double y = prob.y[i];
			double v = target[i];
			total_error += (v-y)*(v-y);
			sumv += v;
			sumy += y;
			sumvv += v*v;
			sumyy += y*y;
			sumvy += v*y;
		}
		printf("Cross Validation Mean squared error = %g\n",total_error/prob.l);
		printf("Cross Validation Squared correlation coefficient = %g\n",
			((prob.l*sumvy-sumv*sumy)*(prob.l*sumvy-sumv*sumy))/
			((prob.l*sumvv-sumv*sumv)*(prob.l*sumyy-sumy*sumy))
			);
	}
	else
	{
		for(i=0;i<prob.l;i++)
			if(target[i] == prob.y[i])
				++total_correct;
		printf("Cross Validation Accuracy = %g%%\n",100.0*total_correct/prob.l);
	}
	free(target);
}
void parse_command_line(int argc, char **argv, char *input_file_name, char *model_file_name)
{
	int i;
	void (*print_func)(const char*) = NULL;	// default printing to stdout
	// default values
	param.svm_type = C_SVC;
	param.kernel_type = RBF;
	param.degree = 3;
	param.gamma = 0;	// 1/num_features
	param.coef0 = 0;
	param.nu = 0.5;
	param.cache_size = 100;
	param.C = 1;
	param.eps = 1e-3;
	param.p = 0.1;
	param.shrinking = 1;
	param.probability = 0;
	param.nr_weight = 0;
	param.weight_label = NULL;
	param.weight = NULL;
	cross_validation = 0;
	// parse options
	for(i=1;i<argc;i++)
	{
		if(argv[i][0] != '-') break;
		if(++i>=argc)
exit_with_help();
switch(argv[i-1][1])
{
case 's':
param.svm_type = atoi(argv[i]);
break;
case 't':
param.kernel_type = atoi(argv[i]);
break;
case 'd':
param.degree = atoi(argv[i]);
break;
case 'g':
param.gamma = atof(argv[i]);
break;
case 'r':
param.coef0 = atof(argv[i]);
break;
case 'n':
param.nu = atof(argv[i]);
break;
case 'm':
param.cache_size = atof(argv[i]);
break;
case 'c':
param.C = atof(argv[i]);
break;
case 'e':
param.eps = atof(argv[i]);
break;
case 'p':
param.p = atof(argv[i]);
break;
case 'h':
param.shrinking = atoi(argv[i]);
break;
case 'b':
param.probability = atoi(argv[i]);
break;
case 'q':
print_func = &print_null;
i--;
break;
case 'v':
cross_validation = 1;
nr_fold = atoi(argv[i]);
if(nr_fold < 2)
{
fprintf(stderr,"n-fold cross validation: n must >= 2\n");
exit_with_help();
}
break;
case 'w':
++param.nr_weight;
param.weight_label = (int *)realloc(param.weight_label,sizeof(int)*param.nr_weight);
param.weight = (double *)realloc(param.weight,sizeof(double)*param.nr_weight);
param.weight_label[param.nr_weight-1] = atoi(&argv[i-1][2]);
param.weight[param.nr_weight-1] = atof(argv[i]);
break;
default:
fprintf(stderr,"Unknown option: -%c\n", argv[i-1][1]);
exit_with_help();
}
}
svm_set_print_string_function(print_func);
// determine filenames
if(i>=argc)
exit_with_help();
strcpy(input_file_name, argv[i]);
	if(i<argc-1)
		strcpy(model_file_name,argv[i+1]);
	else
	{
		char *p = strrchr(argv[i],'/');
		if(p==NULL)
			p = argv[i];
		else
			++p;
		sprintf(model_file_name,"%s.model",p);
	}
}
// read in a problem (in svmlight format)
void read_problem(const char *filename)
{
	int max_index, inst_max_index, i, j;
	long int elements;
	FILE *fp = fopen(filename,"r");
	char *endptr;
	char *idx, *val, *label;
	int *d;
	double value;
	if(fp == NULL)
	{
		fprintf(stderr,"can't open input file %s\n",filename);
		exit(1);
	}
	prob.l = 0;
	elements = 0;
	max_line_len = 1024;
	line = Malloc(char,max_line_len);
#ifdef _DENSE_REP
	max_index = 1;
	// read the max dimension of all vectors
	while(readline(fp) != NULL)
	{
		char *p;
		p = strrchr(line, ':');
		if(p != NULL)
		{
			while(*p != ' ' && *p != '\t' && p > line)
p--;
if(p > line)
max_index = (int) strtol(p,&endptr,10) + 1;
}
if(max_index > elements)
elements = max_index;
++prob.l;
}
rewind(fp);
prob.y = Malloc(double,prob.l);
prob.x = Malloc(struct svm_node,prob.l);
	for(i=0;i<prob.l;i++)
	{
		(prob.x+i)->values = Malloc(double,elements);
(prob.x+i)->dim = 0;
		inst_max_index = -1; // strtol gives 0 if wrong format, and precomputed kernel has <index> start from 0
readline(fp);
label = strtok(line," \t");
prob.y[i] = strtod(label,&endptr);
if(endptr == label)
exit_input_error(i+1);
while(1)
{
idx = strtok(NULL,":");
val = strtok(NULL," \t");
if(val == NULL)
break;
errno = 0;
j = (int) strtol(idx,&endptr,10);
if(endptr == idx || errno != 0 || *endptr != '\0' || j <= inst_max_index)
exit_input_error(i+1);
else
inst_max_index = j;
errno = 0;
value = strtod(val,&endptr);
if(endptr == val || errno != 0 || (*endptr != '\0' && !isspace(*endptr)))
exit_input_error(i+1);
d = &((prob.x+i)->dim);
while (*d < j)
(prob.x+i)->values[(*d)++] = 0.0;
(prob.x+i)->values[(*d)++] = value;
}
}
max_index = elements-1;
#else
while(readline(fp)!=NULL)
{
char *p = strtok(line," \t"); // label
// features
while(1)
{
p = strtok(NULL," \t");
if(p == NULL || *p == '\n') // check '\n' as ' ' may be after the last feature
break;
++elements;
}
++elements;
++prob.l;
}
rewind(fp);
prob.y = Malloc(double,prob.l);
prob.x = Malloc(struct svm_node *,prob.l);
x_space = Malloc(struct svm_node,elements);
max_index = 0;
j=0;
	for(i=0;i<prob.l;i++)
	{
		inst_max_index = -1; // strtol gives 0 if wrong format, and precomputed kernel has <index> start from 0
readline(fp);
prob.x[i] = &x_space[j];
label = strtok(line," \t\n");
if(label == NULL) // empty line
exit_input_error(i+1);
prob.y[i] = strtod(label,&endptr);
if(endptr == label || *endptr != '\0')
exit_input_error(i+1);
while(1)
{
idx = strtok(NULL,":");
val = strtok(NULL," \t");
if(val == NULL)
break;
errno = 0;
x_space[j].index = (int) strtol(idx,&endptr,10);
if(endptr == idx || errno != 0 || *endptr != '\0' || x_space[j].index <= inst_max_index)
exit_input_error(i+1);
else
inst_max_index = x_space[j].index;
errno = 0;
x_space[j].value = strtod(val,&endptr);
if(endptr == val || errno != 0 || (*endptr != '\0' && !isspace(*endptr)))
exit_input_error(i+1);
++j;
}
if(inst_max_index > max_index)
max_index = inst_max_index;
x_space[j++].index = -1;
}
#endif
if(param.gamma == 0 && max_index > 0)
param.gamma = 1.0/max_index;
if(param.kernel_type == PRECOMPUTED)
		for(i=0;i<prob.l;i++)
		{
#ifdef _DENSE_REP
			if ((prob.x+i)->dim == 0 || (prob.x+i)->values[0] == 0.0)
{
fprintf(stderr,"Wrong input format: first column must be 0:sample_serial_number\n");
exit(1);
}
if ((int)(prob.x+i)->values[0] < 0 || (int)(prob.x+i)->values[0] > max_index)
{
fprintf(stderr,"Wrong input format: sample_serial_number out of range\n");
exit(1);
}
#else
if (prob.x[i][0].index != 0)
{
fprintf(stderr,"Wrong input format: first column must be 0:sample_serial_number\n");
exit(1);
}
if ((int)prob.x[i][0].value <= 0 || (int)prob.x[i][0].value > max_index)
{
fprintf(stderr,"Wrong input format: sample_serial_number out of range\n");
exit(1);
}
#endif
}
fclose(fp);
}
================================================
FILE: src/windows/libsvm_train_dense_gpu/svm.cpp
================================================
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <float.h>
#include <string.h>
#include <stdarg.h>
#include <limits.h>
#include <locale.h>