Text Classification with Labelled and Unlabelled Data
Methodology
A Support Vector Machine (SVM) is a classifier that, given a training set of labelled feature vectors, tries to find a decision function that separates the data with the maximum possible margin, i.e. the largest possible distance between the decision boundary and the nearest training vectors. In other words, it tries to find the Optimum Separating Hyperplane (OSH). SVMs are binary classifiers: they separate the data into two groups, one consisting of the feature vectors that belong to class A (the category for which the SVM is being trained) and the other consisting of the feature vectors that do not belong to class A. There are also techniques that combine several SVMs so that they can be used for multi-labelled classification; see the thesis for more details. Note that Figure 1 shows the "best case", in which the data can be separated without errors. In practice the data often cannot be separated perfectly, so some feature vectors end up on the "wrong" side of the hyperplane.
Figure 1. A Support Vector Machine
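The idea of a margin can be made concrete with a small sketch. The Python code below does not train an SVM; it only measures the margin of a few hand-picked hyperplanes on a toy two-dimensional data set and picks the widest one among them. The data, the candidate hyperplanes and all function names are assumptions made for illustration and do not come from the thesis. A real SVM finds the optimum by solving an optimisation problem rather than by comparing a fixed list of candidates.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign says which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Return +1 (belongs to class A) or -1 (does not belong to class A)."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any training point to the hyperplane.

    The value is positive only if every point lies on its correct side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * decision_value(w, b, x) / norm
               for x, y in zip(points, labels))


def count_errors(w, b, points, labels):
    """Number of training points that fall on the wrong side."""
    return sum(1 for x, y in zip(points, labels) if predict(w, b, x) != y)


# Toy, linearly separable data (illustrative only).
POINTS = [(2, 2), (3, 3), (3, 1), (0, 0), (-1, 0), (0, -1)]
LABELS = [1, 1, 1, -1, -1, -1]

# Hypothetical candidate hyperplanes, each given as (w, b).
CANDIDATES = {
    "diagonal": ((1.0, 1.0), -2.0),   # x1 + x2 = 2
    "vertical": ((1.0, 0.0), -1.0),   # x1 = 1
    "shifted": ((1.0, 0.0), -2.5),    # x1 = 2.5, misclassifies one point
}

# Among the candidates, keep the one with the widest margin.
best = max(CANDIDATES,
           key=lambda name: geometric_margin(*CANDIDATES[name], POINTS, LABELS))

for name, (w, b) in CANDIDATES.items():
    print(name,
          "margin:", round(geometric_margin(w, b, POINTS, LABELS), 3),
          "errors:", count_errors(w, b, POINTS, LABELS))
print("widest margin among the candidates:", best)
```

Both "diagonal" and "vertical" separate the toy data without errors, but "diagonal" keeps the nearest points further away, which is the property an SVM optimises for. The "shifted" candidate puts one vector on the wrong side, so its margin is negative.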