Research on data-to-text generation based on transformer model and deep neural network
CLC number: TP311

Fund project: National Key R&D Program of China (2018YFB1402500).



Abstract:

Data-to-text generation is a natural language processing approach that produces coherent text from structured data. In recent years, methods based on deep neural networks trained end-to-end have shown great promise for this task: they can automatically process large amounts of data to generate coherent text, and are widely used in scenarios such as news writing and report generation. However, existing work reasons poorly over specific information in the data, such as numeric values and times; it fails to fully exploit the structural relations among data records to guide generation; and training tends to separate semantics from syntax. This paper therefore proposes a data-to-text generation method that combines a Transformer model with a deep neural network, together with a Transformer Text Planning (TTP) algorithm for content planning, which effectively addresses these problems. Experiments on the public Rotowire dataset show that the proposed method outperforms existing data-to-text generation models and can be applied directly to the task of generating coherent text from structured data, giving it practical application value.
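The pipeline the abstract describes has two stages: a content-planning step that selects and orders records from the structured input, followed by a surface-realization step that turns the plan into text. The sketch below is a minimal, non-neural illustration of that decomposition — it is not the paper's TTP algorithm; the record format, the value-based salience score, and the templates are illustrative assumptions loosely modeled on Rotowire box-score data.

```python
# Toy two-stage data-to-text pipeline: content planning, then realization.
# NOTE: salience scoring and templates below are illustrative stand-ins
# for the learned components described in the paper.

from dataclasses import dataclass

@dataclass
class Record:
    entity: str   # e.g. a player or team name
    rtype: str    # record type, e.g. "PTS" (points scored)
    value: int    # the numeric cell value

def plan(records, k=2):
    """Stage 1 (content planning): keep the k most salient records,
    here simply the ones with the largest values, in order."""
    return sorted(records, key=lambda r: r.value, reverse=True)[:k]

# Hypothetical templates standing in for the neural decoder.
TEMPLATES = {
    "PTS": "{entity} scored {value} points",
    "REB": "{entity} grabbed {value} rebounds",
}

def realize(plan_records):
    """Stage 2 (surface realization): render the plan as one sentence."""
    clauses = [TEMPLATES[r.rtype].format(entity=r.entity, value=r.value)
               for r in plan_records]
    return " and ".join(clauses) + "."

records = [
    Record("LeBron James", "PTS", 31),
    Record("LeBron James", "REB", 8),
    Record("Kevin Love", "PTS", 18),
]
print(realize(plan(records)))
# -> LeBron James scored 31 points and Kevin Love scored 18 points.
```

In the paper's setting both stages are learned jointly, which is what lets the model reason over numbers and record structure instead of relying on fixed salience rules and templates as this sketch does.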

Cite this article:

XU Xiaohong, HE Ting, WANG Huazhen, CHEN Jian. Research on data-to-text generation based on transformer model and deep neural network[J]. Journal of Chongqing University, 2020, 43(7): 91-100.
History
  • Received: 2019-12-09
  • Published online: 2020-07-18