Abstract: Channel mismatch and incomplete acquisition of voiceprint features under short-speech or noisy conditions are two persistent problems in voiceprint recognition. This paper proposes a solution that combines traditional techniques with deep learning. An I-Vector model is used as the teacher model for knowledge distillation into a ResNet student model. The student is a ResNet built on metric learning and includes an attentive statistics pooling layer, which captures and emphasizes the critical information in voiceprint features and improves their distinguishability. The mean square error (MSE) is combined with a metric-learning loss to reduce computational complexity and enhance the model's learning capability. The trained model was then evaluated on a voiceprint recognition test. Compared with voiceprint recognition models based on various other deep learning methods, it achieved the lowest equal error rate (EER), 3.229%, indicating that the model performs voiceprint recognition more effectively.
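To make the two central components concrete, the following is a minimal NumPy sketch of attentive statistics pooling and of a loss that adds an MSE distillation term to a metric-learning loss. It is an illustration under stated assumptions, not the implementation used in this work: the function names, the parameters `W`, `b`, `v`, the weighting factor `weight`, and the assumption that the student embedding and the teacher I-Vector share the same dimension are all hypothetical. The metric-learning loss is not specified in the abstract, so it is passed in as an already computed scalar.

```python
import numpy as np

def attentive_stats_pooling(h, W, b, v):
    """Pool frame-level features h (T, D) into one utterance vector (2D,).

    W (D, A), b (A,), and v (A,) are hypothetical attention parameters.
    """
    # One scalar attention score per frame.
    scores = np.tanh(h @ W + b) @ v              # shape (T,)
    scores = scores - scores.max()               # numerical stability for softmax
    alpha = np.exp(scores) / np.exp(scores).sum()  # weights sum to 1 over frames

    # Attention-weighted mean and standard deviation over time.
    mean = alpha @ h                             # shape (D,)
    var = alpha @ (h ** 2) - mean ** 2
    std = np.sqrt(np.clip(var, 1e-8, None))      # clip guards against tiny negatives
    return np.concatenate([mean, std])

def combined_loss(student_emb, teacher_ivector, metric_loss, weight=1.0):
    """Metric-learning loss plus an MSE term pulling the student toward the teacher.

    Assumes both vectors have the same dimension; `weight` is an assumed
    balancing factor, not a value taken from the paper.
    """
    mse = np.mean((student_emb - teacher_ivector) ** 2)
    return metric_loss + weight * mse
```

When `v` is all zeros, every frame receives the same weight and the pooling reduces to the ordinary mean and standard deviation over time, which is a convenient sanity check for the attention mechanism.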