Abstract: Sentence sentiment classification is an important task for extracting emotional semantics from text. Currently, the best tools for sentence sentiment classification leverage deep neural networks, particularly BERT-based models. However, these models require large, high-quality labeled datasets to perform effectively. In practice, labeled data is usually limited, which leads to overfitting on small datasets and difficulty capturing subtle sentiment features. Although existing semi-supervised models exploit features from large unlabeled datasets, they still suffer from errors introduced by pseudo-labeled samples. Additionally, once the test data has been labeled, these models often do not adapt by incorporating feature information from it. To address these issues, this paper proposes a semi-supervised sentence sentiment classification model. First, a K-nearest-neighbors-based weighting mechanism is designed that assigns higher weights to high-confidence samples to minimize error propagation during parameter learning. Second, a two-stage training mechanism is implemented, enabling the model to correct misclassified samples in the test data. Extensive experiments on multiple datasets show that this method achieves strong performance on small datasets.
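The abstract does not specify how the K-nearest-neighbors weights are computed, so the following is only an illustrative sketch of one plausible reading, not the paper's actual formulation. It assumes that each sentence is already represented by an embedding vector, that a pseudo-label's confidence can be approximated by the fraction of its k nearest labeled neighbors that share the same label, and that these confidences scale each sample's contribution to the training loss. The function names `knn_agreement_weights` and `weighted_loss` are hypothetical.

```python
import math


def knn_agreement_weights(labeled_vecs, labeled_y, pseudo_vecs, pseudo_y, k=3):
    """Weight each pseudo-labeled sample by the fraction of its k nearest
    labeled neighbors whose label matches the pseudo-label.

    Assumption: neighbor agreement stands in for "confidence"; the paper
    may define confidence differently.
    """
    if not 1 <= k <= len(labeled_vecs):
        raise ValueError("k must be between 1 and the number of labeled samples")
    weights = []
    for vec, guess in zip(pseudo_vecs, pseudo_y):
        # Euclidean distance from this sample to every labeled sample.
        dists = [(math.dist(vec, lv), y) for lv, y in zip(labeled_vecs, labeled_y)]
        dists.sort(key=lambda pair: pair[0])
        # Count how many of the k closest labeled samples agree with the guess.
        agree = sum(1 for _, y in dists[:k] if y == guess)
        weights.append(agree / k)
    return weights


def weighted_loss(per_sample_losses, weights):
    """Weighted mean of per-sample losses; returns 0.0 if all weights are zero."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(loss * w for loss, w in zip(per_sample_losses, weights)) / total
```

With two well-separated labeled clusters, a pseudo-label that contradicts all of its nearest labeled neighbors receives weight 0 and drops out of the weighted loss, which is the error-damping behavior the weighting mechanism is meant to provide. In practice the embeddings would come from the BERT-based encoder and the per-sample losses from its classification head.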