Fuzzy webpage text classification algorithm combined with improved NMF
Abstract:
A term-document weight matrix representing a collection of web pages can be generated by constructing a vector space model. Because classifying directly on this high-dimensional matrix is inefficient, a fuzzy webpage text classification algorithm combined with an improved nonnegative matrix factorization (NMF) is presented. First, the original high-dimensional data are mapped into a low-dimensional semantic space via an iterative normalized compression NMF (NCMF), which reduces the complexity of the problem. Second, because deterministic matrices handle ambiguous words poorly, fuzzy logic is incorporated into the classification model: the fuzzy categorization set of each document is constructed from the fuzzy membership degrees between features and categories. Comparative experiments show that the proposed classification algorithm achieves higher accuracy and better time performance.
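The two stages described in the abstract can be illustrated with a minimal sketch. It is not the paper's NCMF, whose details the abstract does not give. As a stand-in, it uses the standard multiplicative-update NMF with the columns of W renormalized after each iteration, and a simple membership rule assumed here for illustration: each latent feature's membership in a category is the share of that feature's weight carried by documents of that category. The function names, the toy matrix V (rows are terms, columns are documents), and the labels are all hypothetical.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def normalized_nmf(V, rank, iters=500, seed=0, eps=1e-9):
    """Approximate V (terms x docs) by W (terms x rank) times H (rank x docs).

    Stand-in for the paper's NCMF: multiplicative updates for the Frobenius
    objective, followed by an L1 normalization of each column of W.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
        # Normalize each column of W and rescale the matching row of H,
        # so the product W H is unchanged.
        for k in range(rank):
            s = sum(W[i][k] for i in range(n)) + eps
            for i in range(n):
                W[i][k] /= s
            for j in range(m):
                H[k][j] *= s
    return W, H

def feature_memberships(H, labels, n_classes, eps=1e-9):
    """Assumed rule: mu[k][c] is the share of feature k's weight in class c."""
    mu = []
    for row in H:
        mass = [0.0] * n_classes
        for weight, c in zip(row, labels):
            mass[c] += weight
        total = sum(mass) + eps
        mu.append([v / total for v in mass])
    return mu

def fuzzy_category_set(h, mu, eps=1e-9):
    """Membership of one document (latent vector h) in every category."""
    total = sum(h) + eps
    return [sum(hk * mu_k[c] for hk, mu_k in zip(h, mu)) / total
            for c in range(len(mu[0]))]

# Toy term-document matrix: terms 0-1 occur in documents 0-1, terms 2-3 in 2-3.
V = [[3, 2, 0, 0],
     [2, 3, 0, 0],
     [0, 0, 3, 2],
     [0, 0, 2, 3]]
labels = [0, 0, 1, 1]  # hypothetical category of each document

W, H = normalized_nmf(V, rank=2)
mu = feature_memberships(H, labels, n_classes=2)
fuzzy_sets = [fuzzy_category_set(h, mu) for h in transpose(H)]
predicted = [max(range(len(s)), key=s.__getitem__) for s in fuzzy_sets]
print(fuzzy_sets, predicted)
```

Each document receives a graded membership in every category rather than a single hard label, and the category with the largest membership is taken as the prediction. Classifying an unseen page would additionally require projecting its term vector onto the learned W, which this sketch leaves out.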