Abstract: Traditional deduplication methods are mostly designed for a specific field and address only one aspect of the problem. To overcome this limitation, a scheme based on Markov Logic Networks (MLNs), a new Statistical Relational Learning (SRL) model, is proposed. Deduplication is formalized in MLNs, exploiting their ability to define a probability distribution over possible worlds that supports inference. A discriminative learning algorithm is adopted to learn the MLN weights, and the MC-SAT algorithm is adopted for inference. The paper shows how a small number of predicate rules can capture the essential features of different aspects of deduplication, and how these rules can be combined to compose a variety of models. The experimental results show that the MLN-based method not only covers the original Fellegi-Sunter model but also achieves better results than traditional deduplication methods based on clustering algorithms and similarity measures. This suggests that Markov Logic Networks can play an important part in practical applications.
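To make the notion of a probability distribution over worlds concrete, the following is a minimal illustrative sketch and not the implementation used in this work. An MLN assigns each world x the probability P(X = x) = (1/Z) exp(sum_i w_i n_i(x)), where n_i(x) is the number of true groundings of rule i in x and w_i is its weight. The sketch assumes a single hypothetical rule, SimilarName(a, b) => SameRecord(a, b), with an arbitrary weight of 2.0, and computes match probabilities by exhaustive enumeration of worlds rather than by MC-SAT, which is only feasible for toy domains. The record identifiers, predicate names, and weight are all assumptions made for illustration.

```python
import math
from itertools import product

# Assumed weight for the hypothetical rule SimilarName(a, b) => SameRecord(a, b).
W_NAME = 2.0


def n_true_groundings(similar, same):
    """Count the true groundings of SimilarName(p) => SameRecord(p) over all pairs."""
    return sum(1 for p in similar if (not similar[p]) or same[p])


def match_marginals(similar, weight=W_NAME):
    """Return P(SameRecord(p) | evidence) for each pair by enumerating all worlds.

    `similar` maps a record pair to the observed truth value of SimilarName.
    Each world is one truth assignment to every SameRecord(p) atom.
    """
    pairs = sorted(similar)
    z = 0.0  # partition function Z
    mass = {p: 0.0 for p in pairs}  # unnormalized mass of worlds where p matches
    for values in product([False, True], repeat=len(pairs)):
        same = dict(zip(pairs, values))
        score = math.exp(weight * n_true_groundings(similar, same))
        z += score
        for p in pairs:
            if same[p]:
                mass[p] += score
    return {p: mass[p] / z for p in pairs}


# Toy evidence: r1 and r2 have similar names, r1 and r3 do not.
evidence = {("r1", "r2"): True, ("r1", "r3"): False}
marginals = match_marginals(evidence)
```

With a single rule of this form, the match probability for a pair with similar names reduces to the logistic function of the weight, while a pair without the evidence stays at 0.5. Richer models would add further weighted rules, for example rules that couple the decisions for different pairs.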