Smoothing Technique for Statistical Language Model Based on Global Discount

CLC Number: TP181

    Abstract:

    Smoothing techniques are mainly used to address data sparsity in statistical language models. Existing smoothing techniques handle sparse data with different discounting and compensation strategies, and each has its own strengths and weaknesses in complexity and soundness. This paper presents a new smoothing technique for the bigram model based on global discounting. The model parameters, i.e. the bigram probabilities, are discounted according to bigram frequency, and unseen events are compensated from the lower-order model; the soundness of this scheme is supported by minimizing perplexity. Experimental results show that the technique outperforms the commonly used Katz smoothing technique.
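
    The abstract does not give the discount formula, so the following is only a minimal sketch of the general discount-and-compensate idea it describes, not the authors' method. It assumes a single fixed multiplicative factor `discount` applied to every seen bigram, a closed vocabulary, and redistribution of the freed probability mass to unseen bigrams in proportion to unigram probabilities. The names `build_bigram_model`, `perplexity`, and `discount` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_bigram_model(tokens, discount=0.9):
    """Return prob(h, w) = P(w | h) for a discounted bigram model.

    Assumption: one fixed factor `discount` scales every seen bigram
    estimate; this is an illustrative choice, not the paper's formula.
    """
    vocab = sorted(set(tokens))
    unigram = Counter(tokens)
    total = len(tokens)
    p_uni = {w: unigram[w] / total for w in vocab}  # lower-order model

    bigram = Counter(zip(tokens, tokens[1:]))
    history = Counter(tokens[:-1])  # how often each word occurs as a context
    followers = defaultdict(set)
    for h, w in bigram:
        followers[h].add(w)

    def prob(h, w):
        if history[h] == 0:
            # Context never observed: rely entirely on the unigram model.
            return p_uni[w]
        seen = followers[h]
        unseen_mass = sum(p_uni[v] for v in vocab if v not in seen)
        # If every word was seen after h, there is nothing to compensate.
        d = discount if unseen_mass > 0 else 1.0
        if w in seen:
            return d * bigram[(h, w)] / history[h]
        # Freed mass (1 - d) is shared among unseen words by unigram weight.
        return (1 - d) * p_uni[w] / unseen_mass

    return prob, vocab


def perplexity(prob, tokens):
    """Perplexity of a token sequence under the bigram model `prob`."""
    pairs = list(zip(tokens, tokens[1:]))
    log_sum = sum(math.log2(prob(h, w)) for h, w in pairs)
    return 2 ** (-log_sum / len(pairs))


tokens = "the cat sat on the mat the cat ate".split()
prob, vocab = build_bigram_model(tokens, discount=0.9)
print(perplexity(prob, tokens))
```

    Under these assumptions, the conditional probabilities for each context sum to one. A value for `discount` could be chosen by comparing perplexity on held-out text across several candidate values, which mirrors the abstract's use of perplexity as the selection criterion.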

Get Citation

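Huang Yongwen, He Zhongshi. Smoothing Technique for Statistical Language Model Based on Global Discount [J]. Journal of Chongqing University, 2005, 28(8): 51-55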

History
  • Received: April 05, 2005
  • Revised: April 05, 2005