DATA MINING AND TEXT MINING: EFFICIENT TEXT CLASSIFICATION USING SVMS FOR LARGE DATASETS

Authors

  • Srikanth Bethu*, B Sankara Babu

Keywords:

Data mining, Dual problem, Learning algorithm, Primal problem, Support Vector Machines (SVM), Text mining.

Abstract

Text mining and data mining support many kinds of algorithms for classifying large data sets. Text categorization is traditionally done using Term Frequency and Inverse Document Frequency (TF-IDF) weighting. This method does not eliminate unimportant words from the document. To reduce the error of classifying documents into the wrong category, efficient classification algorithms are needed. Support Vector Machines (SVMs) are large-margin classification algorithms that give good generalization, compactness and performance. However, SVMs can give low accuracy, and on large data sets they typically need a large number of support vectors. We introduce a new learning algorithm, suited to the dual problem, that adds support vectors incrementally. It mainly involves a classification algorithm that solves the primal problem instead of the dual problem. With this approach, we are able to reduce the complexity of the resulting classifier compared with existing work. Experimental results show classification accuracy comparable to that of existing work.
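
For orientation, the two ingredients named in the abstract, TF-IDF weighting and an SVM trained on the primal problem, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' algorithm: the function names (tfidf, train_primal_svm, score), the smoothed IDF formula, the decaying step size, the omission of a bias term and the toy documents are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocab, vectors) of TF-IDF weights for a list of tokenized documents."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)
    index = {term: i for i, term in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0.0] * len(vocab)
        for term, count in Counter(doc).items():
            # Assumption: relative term frequency times a smoothed IDF.
            idf = math.log((1 + n) / (1 + df[term])) + 1.0
            vec[index[term]] = (count / len(doc)) * idf
        vectors.append(vec)
    return vocab, vectors


def score(w, x):
    """Linear decision value w . x (no bias term, for brevity)."""
    return sum(wj * xj for wj, xj in zip(w, x))


def train_primal_svm(X, y, lam=0.01, epochs=50):
    """Subgradient descent on the primal objective
    (lam / 2) * ||w||^2 + mean(max(0, 1 - y_i * (w . x_i))).
    Labels in y must be +1 or -1."""
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size (an assumption)
            margin = yi * score(w, xi)
            # Gradient step on the regularizer shrinks every weight.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1:
                # Hinge loss is active: move w toward the correct side.
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w


# Toy corpus (hypothetical): two sports documents, two finance documents.
docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["stock", "market", "price"],
    ["price", "bank", "stock"],
]
labels = [1, 1, -1, -1]

vocab, X = tfidf(docs)
w = train_primal_svm(X, labels)
predictions = [1 if score(w, x) > 0 else -1 for x in X]
```

Because the primal solver works directly on the weight vector w, the size of the trained model is fixed by the vocabulary and does not grow with the number of support vectors, which is one way to read the reduced classifier complexity that the abstract reports.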

Published

2016-08-30

Section

Articles