Spectrogram texture pattern guided vocal separation
Affiliation: Tianjin University

Fund Project: The National Natural Science Foundation of China (General Program, Key Program, Major Research Plan)

    Abstract:

    Vocals and accompaniment each have distinctive spectral textures, but their spectral lines often overlap and intertwine on the spectrogram, which makes separating them from monaural audio very difficult. To address this, a stacked hourglass network that integrates multi-resolution attention and multi-channel cross-attention is proposed to finely characterize the spectral-line textures of vocals and accompaniment. First, because vocals and accompaniment differ in spectral line density along the frequency dimension of the spectrogram, multi-resolution attention is applied to decoder features at different resolutions, so that the time-frequency texture patterns of each source are represented at an appropriate resolution. Second, multi-channel cross-attention is proposed to better represent the accompaniment's instantaneous time-frequency characteristics in the frequency dimension and its flat, sustained characteristics in the time dimension, thereby effectively extracting the accompaniment's spectrogram features. Experimental results on the MIR-1K dataset show that, compared with the current state-of-the-art model SHN, the proposed model has about 33% fewer parameters while improving the global normalized source-to-distortion ratio (GNSDR) by 1.35 dB for vocals and by 0.89 dB for accompaniment. These results show that fully representing the spectrogram features of different sound sources can further improve the separation of vocals and accompaniment.
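The GNSDR figures in the abstract can be made concrete with a short sketch. The code below is not the authors' evaluation code; it assumes a simplified energy-ratio SDR instead of the full BSS Eval decomposition commonly used for MIR-1K results, and the function names (`sdr_db`, `nsdr_db`, `gnsdr_db`) are hypothetical. It shows the usual construction: NSDR is the SDR gain of the estimate over the raw mixture, and GNSDR is the length-weighted mean of NSDR over all clips.

```python
import math


def sdr_db(reference, estimate):
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Assumption: plain energy ratio, not the BSS Eval projection-based SDR.
    The reference is assumed to have nonzero energy.
    """
    signal_energy = sum(s * s for s in reference)
    error_energy = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


def nsdr_db(reference, estimate, mixture):
    """Normalized SDR: improvement of the estimate over the unprocessed mixture."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)


def gnsdr_db(clips):
    """Global NSDR: NSDR averaged over clips, weighted by clip length.

    `clips` is a list of (reference, estimate, mixture) sample sequences.
    """
    total_length = sum(len(reference) for reference, _, _ in clips)
    weighted_sum = sum(
        len(reference) * nsdr_db(reference, estimate, mixture)
        for reference, estimate, mixture in clips
    )
    return weighted_sum / total_length
```

Under this definition, a reported improvement of 1.35 dB in vocal GNSDR means that the length-weighted average SDR gain over the mixture is 1.35 dB higher than that of the baseline model.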

History
  • Received: July 20, 2024
  • Revised: November 24, 2024
  • Adopted: February 13, 2025