An Industry Evaluation of Embedding-based Entity Alignment

Ziheng Zhang, Hualuo Liu, Jiaoyan Chen, Xi Chen, Bo Liu, YueJia Xiang, Yefeng Zheng


Abstract
Embedding-based entity alignment has been widely investigated in recent years, but most proposed methods still rely on an ideal supervised learning setting with a large number of unbiased seed mappings for training and validation, which significantly limits their usage. In this study, we evaluate those state-of-the-art methods in an industrial context, where the impact of seed mappings with different sizes and different biases is explored. Besides the popular benchmarks from DBpedia and Wikidata, we contribute and evaluate a new industrial benchmark that is extracted from two heterogeneous knowledge graphs (KGs) under deployment for medical applications. The experimental results enable the analysis of the advantages and disadvantages of these alignment methods and the further discussion of suitable strategies for their industrial deployment.
Anthology ID:
2020.coling-industry.17
Volume:
Proceedings of the 28th International Conference on Computational Linguistics: Industry Track
Month:
December
Year:
2020
Address:
Online
Editors:
Ann Clifton, Courtney Napoles
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
179–189
URL:
https://aclanthology.org/2020.coling-industry.17/
DOI:
10.18653/v1/2020.coling-industry.17
Cite (ACL):
Ziheng Zhang, Hualuo Liu, Jiaoyan Chen, Xi Chen, Bo Liu, YueJia Xiang, and Yefeng Zheng. 2020. An Industry Evaluation of Embedding-based Entity Alignment. In Proceedings of the 28th International Conference on Computational Linguistics: Industry Track, pages 179–189, Online. International Committee on Computational Linguistics.
Cite (Informal):
An Industry Evaluation of Embedding-based Entity Alignment (Zhang et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-industry.17.pdf
Code
ZihengZZH/industry-eval-EA
Data
DBpedia, YAGO