Markov Logic Networks with its application in Deduplication

Fund Project: Natural Science Foundation of Chongqing (CSTC 2008BB2021); China Postdoctoral Science Foundation (20070420711)

    Abstract:

    Most existing deduplication methods target a specific domain and address only one aspect of the problem in isolation. To overcome this limitation, a method based on Markov Logic Networks (MLNs), a Statistical Relational Learning (SRL) model, is proposed. Because an MLN defines a probability distribution over possible worlds that can be used for inference, the deduplication problem can be formalized within it. Weights are learned with a discriminative training algorithm, and inference is performed with the MC-SAT algorithm. The paper shows how a small number of predicate formulas capture the essential features of different aspects of deduplication, and how these formulas can be combined to form a variety of models. Experimental results show that the MLN-based method not only subsumes the classical Fellegi-Sunter model but also outperforms traditional deduplication methods based on clustering algorithms and similarity measures, indicating that Markov Logic Networks can play an important role in practical applications.
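The abstract does not reproduce any formulas, so the following is only a minimal sketch of the general idea it refers to: in the standard MLN definition, a world x has probability proportional to exp(Σ w_i · n_i(x)), where n_i(x) counts the true groundings of formula i and w_i is its weight. The predicate names `SimilarName` and `SameRecord`, the records `r1` and `r2`, and the weight 1.5 are all hypothetical and are not taken from the paper. Exhaustive enumeration is used here in place of MC-SAT purely because the example has only four worlds.

```python
import itertools
import math

# Hypothetical ground atoms for two records r1 and r2 (not from the paper).
ATOMS = ("SimilarName(r1,r2)", "SameRecord(r1,r2)")


def implication_count(world):
    """Count true groundings of SimilarName(r1,r2) => SameRecord(r1,r2)."""
    similar = world["SimilarName(r1,r2)"]
    same = world["SameRecord(r1,r2)"]
    return 1 if (not similar) or same else 0


# (weight, counting function) pairs; the weight 1.5 is an illustrative assumption.
FORMULAS = [(1.5, implication_count)]


def unnormalized(world):
    """exp of the weighted sum of true-grounding counts for one world."""
    return math.exp(sum(w * n(world) for w, n in FORMULAS))


def all_worlds():
    """Enumerate every truth assignment to the ground atoms."""
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def conditional(query, evidence):
    """P(query atom is true | evidence), summing over consistent worlds."""
    num = den = 0.0
    for world in all_worlds():
        if any(world[atom] != value for atom, value in evidence.items()):
            continue  # world contradicts the evidence
        weight = unnormalized(world)
        den += weight
        if world[query]:
            num += weight
    return num / den


p_match = conditional("SameRecord(r1,r2)", {"SimilarName(r1,r2)": True})
p_no_signal = conditional("SameRecord(r1,r2)", {"SimilarName(r1,r2)": False})
```

With a single implication formula, `p_match` reduces to the logistic function of the weight, while `p_no_signal` is 0.5 because both consistent worlds satisfy the formula. Enumeration grows exponentially with the number of ground atoms, which is why sampling-based inference is needed on realistic data.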

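Cite this article

Zhang Yufang, Huang Tao, Ai Dongmei, Xiong Zhongyang, Tang Rongjun. Markov Logic Networks with its application in Deduplication [J]. Journal of Chongqing University, 2010, 33(8): 36-41.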

History
  • Received: 2010-01-02