SiGMa

11 August 2013

conference paper
Published by Association for Computing Machinery (ACM)

p. 572-580
https://doi.org/10.1145/2487575.2487592

Abstract

The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains containing complementary information. Tools for automatically aligning these knowledge bases would make it possible to unify many sources of structured knowledge and answer complex queries. However, the efficient alignment of large-scale knowledge bases still poses a considerable challenge. Here, we present Simple Greedy Matching (SiGMa), a simple algorithm for aligning knowledge bases with millions of entities and facts. SiGMa is an iterative propagation algorithm that leverages both the structural information from the relationship graph and flexible similarity measures between entity properties in a greedy local search, which makes it scalable. Despite its greedy nature, our experiments indicate that SiGMa can efficiently match some of the world's largest knowledge bases with high accuracy. We provide additional experiments on benchmark datasets which demonstrate that SiGMa can outperform state-of-the-art approaches both in accuracy and efficiency.
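The greedy, propagation-based matching the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`jaccard`, `greedy_align`), the token-overlap similarity, the neighbour-agreement graph term, the linear mix controlled by `alpha`, and the `threshold` value are all assumptions chosen for illustration. The sketch starts from a few seed matches, keeps candidate pairs in a priority queue, repeatedly accepts the highest-scoring pair, and then proposes the neighbours of that pair as new candidates.

```python
import heapq
from itertools import count


def jaccard(a, b):
    """Token-set Jaccard similarity between two names (illustrative choice)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def greedy_align(names1, names2, nbrs1, nbrs2, seeds, alpha=0.5, threshold=0.3):
    """Greedy one-to-one matching of two small knowledge bases (hypothetical API).

    names1, names2: dict mapping entity id -> name string
    nbrs1, nbrs2:   dict mapping entity id -> set of neighbouring entity ids
    seeds:          list of (e1, e2) pairs assumed to be correct matches
    """
    matched = {}      # entity in KB 1 -> entity in KB 2
    used2 = set()     # entities of KB 2 that are already matched
    heap = []         # max-heap emulated with negated scores
    tie = count()     # tie-breaker so heap entries never compare entity ids

    def score(e1, e2):
        # Graph term: fraction of neighbours that are already matched to each other.
        n1, n2 = nbrs1.get(e1, set()), nbrs2.get(e2, set())
        shared = sum(1 for n in n1 if matched.get(n) in n2)
        graph = shared / max(len(n1), len(n2), 1)
        # Property term: string similarity of the entity names.
        return (1 - alpha) * jaccard(names1[e1], names2[e2]) + alpha * graph

    for e1, e2 in seeds:
        heapq.heappush(heap, (-1.0, next(tie), e1, e2))

    while heap:
        _, _, e1, e2 = heapq.heappop(heap)
        if e1 in matched or e2 in used2:
            continue  # stale candidate: one side was matched in the meantime
        matched[e1] = e2
        used2.add(e2)
        # Propagation: neighbours of a fresh match become (re-scored) candidates.
        for n1 in nbrs1.get(e1, ()):
            if n1 in matched:
                continue
            for n2 in nbrs2.get(e2, ()):
                if n2 in used2:
                    continue
                s = score(n1, n2)
                if s >= threshold:
                    heapq.heappush(heap, (-s, next(tie), n1, n2))
    return matched


# Toy data, invented for this example.
names1 = {"a1": "Quentin Tarantino", "a2": "Pulp Fiction", "a3": "Uma Thurman"}
names2 = {"b1": "Tarantino Quentin", "b2": "Pulp Fiction 1994", "b3": "Uma Thurman"}
nbrs1 = {"a1": {"a2"}, "a2": {"a1", "a3"}, "a3": {"a2"}}
nbrs2 = {"b1": {"b2"}, "b2": {"b1", "b3"}, "b3": {"b2"}}
result = greedy_align(names1, names2, nbrs1, nbrs2, seeds=[("a1", "b1")])
```

In this sketch, only pairs adjacent to an accepted match are ever scored, so the candidate set stays far smaller than the full cross product of entities. Outdated queue entries are not updated in place; they are simply skipped when popped, a common lazy-deletion pattern with `heapq`.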

This publication has 20 references indexed in Scilit:

LINDA
Published by Association for Computing Machinery (ACM), 2012
PARIS
Proceedings of the VLDB Endowment, 2011
Ten Challenges for Ontology Matching
Lecture Notes in Computer Science, 2008
Collective entity resolution in relational data
ACM Transactions on Knowledge Discovery From Data, 2007
A survey on ontology mapping
ACM SIGMOD Record, 2006
Word alignment via quadratic assignment
Published by Association for Computational Linguistics (ACL), 2006
Constraint Propagation
Published by Elsevier BV, 2006
A Systematic Comparison of Various Statistical Alignment Models
Computational Linguistics, 2003
Ontology mapping: the state of the art
The Knowledge Engineering Review, 2003
The Quadratic Assignment Problem
Management Science, 1963

Cited by 73 articles