As the volume of modern archive management data continues to grow, effective clustering of archive text can significantly improve the efficiency of archive classification and retrieval. This paper proposes two incremental multi-modal text data clustering methods. By analyzing text content from multiple perspectives, the methods integrate the latent topic features of texts to improve clustering accuracy. In addition, corresponding incremental multi-modal feature learning models for text clustering are designed to improve the efficiency of partitioning massive, dynamic text collections. Experimental results on real-world text data sets show that the proposed incremental multi-modal text clustering methods outperform the compared state-of-the-art methods and can classify text data effectively.