Jianbin Ma
College of Information Science and Technology, Agricultural University of Hebei, Hebei, 071001, Baoding, China
Ying Li
College of Economic and Trade, Agricultural University of Hebei, Hebei, 071001, Baoding, China
ABSTRACT
The Support Vector Machine (SVM) algorithm is widely applied to text classification. However, an SVM has difficulty labeling samples correctly when only a small number of training samples is available. The Transductive Support Vector Machine (TSVM) was introduced to minimize misclassification of test samples by training on both labeled and unlabeled samples. However, TSVM training requires the parameter N (the number of positive samples) to be supplied manually, and N is difficult to estimate. In this study, the Progressive Similarity Transductive Support Vector Machine (PSTSVM) is introduced: it labels the most likely unlabeled samples pairwise by similarity computation and then retrains to readjust the hyperplane. Experimental results on the Reuters dataset show that the PSTSVM algorithm is effective on a mixed training set of labeled and unlabeled samples.
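The progressive labeling loop summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the selection rule, or the SVM solver, so cosine similarity, the "most similar to an already labeled sample of that class" rule, the simple subgradient-descent linear SVM, and all function names (`train_linear_svm`, `pstsvm_sketch`, `predict`) are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine similarity (assumed here; the abstract does not name the measure).
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def train_linear_svm(X, y, lam=0.01, eta=0.05, epochs=200):
    # Stand-in for a real SVM solver: subgradient descent on the
    # L2-regularized hinge loss with a fixed learning rate.
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (dot(w, xi) + b)
            w = [(1 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1:                          # hinge loss is active
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b

def pstsvm_sketch(X_lab, y_lab, X_unlab):
    # Requires at least one labeled sample of each class (+1 and -1).
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    w, b = train_linear_svm(X, y)
    while pool:
        # Label one pair per round: the unlabeled sample most similar to a
        # positive sample gets +1, then the one most similar to a negative
        # sample gets -1 (hypothetical selection rule).
        for label in (1, -1):
            if not pool:
                break
            refs = [xi for xi, yi in zip(X, y) if yi == label]
            best = max(pool, key=lambda u: max(cosine(u, r) for r in refs))
            pool.remove(best)
            X.append(best)
            y.append(label)
        # Retrain so the hyperplane adjusts to the newly labeled pair.
        w, b = train_linear_svm(X, y)
    return w, b, X, y

def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0 else -1
```

Because samples are labeled one positive and one negative at a time, this sketch never asks for N up front. A fuller version would presumably also use the retrained hyperplane to choose or revise candidates and would need a stopping rule for unbalanced unlabeled sets; the abstract gives no details on either, so both are left out here.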
How to cite this article
Jianbin Ma and Ying Li, 2013. Progressive Similarity Transductive Support Vector Machine Algorithm for Small Sample Text Classification. Information Technology Journal, 12: 7673-7676.
DOI: 10.3923/itj.2013.7673.7676
URL: https://scialert.net/abstract/?doi=itj.2013.7673.7676
Thomas Koch
Hello, I was wondering if I could look at the code you used to generate these results, or a pre-compiled program with instructions on how to use it. I would like to test this algorithm on a different dataset. Thanks!